PREFACE TO THE SECOND RUSSIAN EDITION


For the second edition the book has been revised. The presentation
of the method of iteration is now based on the concept of the contraction
mapping, as it is possible to consider the latter before introducing
the concept of the derivative. The part of the book dealing with the
approximate solution of systems of equations has been substantially
enlarged. Lastly, all problems have been provided with solutions.



PREFACE TO THE FIRST RUSSIAN EDITION 


The main purpose of this book is to present various methods for the
approximate solution of equations. Their practical value is beyond
doubt, yet little attention is paid to them either at school or at college,
so someone who has passed a college-level higher mathematics
course usually has difficulty solving even the simplest transcendental
equation. Not only engineers need to solve equations, but
also technicians, production technologists and people in other professions.
It is also good for high-school students to become
acquainted with the methods of approximate solution of equations.

Since most approximate solution methods involve the idea of the
derivative, we were forced to introduce this concept. We did this intuitively,
making use of a geometric interpretation. Hence, a knowledge
of secondary school mathematics will be sufficient for anyone wanting 
to read this book. 

In writing this book the author made use of a lecture he delivered to
9th and 10th form pupils, members of the school mathematics circle at
the Lomonosov State University of Moscow.

The material contained in this lecture was used by S. I. Schwartzburd,
a teacher at Moscow secondary school No. 425, for extracurricular
work with 9th form pupils. The author expresses his gratitude
to S. I. Schwartzburd for supplying problems involving the solution
of equations by the method of iteration. These problems were
used in writing the book.

The author expresses his profound gratitude to V. G. Boltyansky 
whose remarks were very helpful in improving the original 
manuscript. 



1. Introduction 


In studying mathematics at school much time is spent on solving
equations and systems of equations. Initially, equations of the first
degree and systems of such equations are studied. Then come quadratic,
biquadratic and irrational equations. Finally, the pupil becomes
acquainted with exponential, logarithmic and trigonometric equations.

It is not by chance that so much attention is paid to equations. The
reason is the importance of equations in the practical applications of
mathematics. Whatever field of application you choose, you will
have to solve equations or systems of equations to arrive at a final
answer.

At school, equations are often used in solving physics problems.
Consider, for instance, the following problem.

A stone is dropped into a well. Find the depth of the well if the sound of
the stone striking the water is heard $T$ seconds after it has been dropped.

If we denote the depth of the well by $x$, then to find $x$ we obtain the
equation

$$\sqrt{\frac{2x}{g}} + \frac{x}{v} = T$$

where $g$ is the acceleration of gravity and $v$ is the velocity of sound
in air ($\sqrt{2x/g}$ is the time the stone takes
to fall, and $x/v$ is the time the sound of the stone striking the water
takes to reach us). This is an irrational equation. Putting $\sqrt{x} = y$ we
reduce it to a quadratic equation

$$\frac{y^2}{v} + \sqrt{\frac{2}{g}}\, y - T = 0$$

which may be solved using the well-known formula.
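The reduction to a quadratic can be checked numerically. The following is only a sketch: the function name `well_depth` and the default values $g = 9.81$ m/s² and $v = 340$ m/s are assumptions made for illustration, not values given in the text.

```python
import math

def well_depth(T, g=9.81, v=340.0):
    """Depth x (metres) solving sqrt(2x/g) + x/v = T via y = sqrt(x)."""
    # Quadratic in y: (1/v) * y**2 + sqrt(2/g) * y - T = 0
    a = 1.0 / v
    b = math.sqrt(2.0 / g)
    c = -T
    y = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)  # keep the positive root
    return y * y

depth = well_depth(3.0)
# Substituting the depth back should reproduce T = 3 seconds.
print(depth, math.sqrt(2.0 * depth / 9.81) + depth / 340.0)
```

Only the positive root of the quadratic is kept, since $y = \sqrt{x}$ cannot be negative.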

Equations are used to solve geometric problems as well. For instance,
the problem of dividing an interval $AB$ of length $l$ into intervals
$AC$ and $CB$ such that $AB : AC = AC : CB$ leads to the quadratic
equation

$$x^2 + lx - l^2 = 0$$

where $x$ denotes the length of the interval $AC$.

The problem of dividing the angle $\alpha$ into three equal parts leads to a more
complex equation. This equation is of the form

$$4x^3 - 3x - \cos\alpha = 0$$

where $x = \cos(\alpha/3)$. Such equations, called cubic equations, are not studied
at school, but any course in higher algebra contains the proof
that there is a formula for the solution of such equations [see formula
(3) below].

However, in physics we often come up against problems which lead
to more complex equations, whose solutions are taught neither at
school nor at university. Take, for instance, an iron bar (engineers
would call it a beam) and fix its ends rigidly. If we strike the bar,
transverse oscillations are generated in it. Mathematical physics
shows that to find the frequency of such oscillations the equation

$$e^x + e^{-x} = \frac{2}{\cos x} \tag{1}$$

should be solved, where $e = 2.71828\ldots$.

At school no rules are given for the solution of such equations. Do
not think that this is due to the brevity of the school mathematics curriculum.
There is no formula at all for the solution of equation (1) in
the sense usually accepted at school. Let's make a more precise
statement.

An equation is said to have a solution formula if its roots can be
expressed in terms of the parameters of the equation with the aid of
the arithmetical operations, the extraction of roots and the exponential,
logarithmic, trigonometric and inverse trigonometric functions.
In this sense the quadratic equation $x^2 + px + q = 0$ has a solution
formula of the form

$$x = -\frac{p}{2} \pm \sqrt{\frac{p^2}{4} - q} \tag{2}$$

There is a formula for the solution of the cubic equation*

$$x^3 + px + q = 0$$

as well. It is of the form

$$x = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}} \tag{3}$$

However, the use of formula (3) in practice involves a number of difficulties
and requires the use of complex numbers.

There is also a formula for the solution of equations of the fourth
degree, but it is so complicated that we shall not give it here.


*With the aid of the substitution $x + \frac{a_1}{3a_0} = y$ any cubic equation
$a_0 x^3 + a_1 x^2 + a_2 x + a_3 = 0$ may be reduced to the above form.

The situation with equations of the fifth and higher degrees is
even worse. The Norwegian mathematician Niels Abel proved in 1826
that for $n \ge 5$ there is no formula for the solution of the algebraic
equation

$$a_0 x^n + a_1 x^{n-1} + \ldots + a_n = 0$$

with the aid of arithmetical operations and extraction of roots. Solution
formulas exist only for particular cases of algebraic equations of degree
higher than the fourth*.

If mathematicians limited their studies to equations having exact
solutions, i.e. solutions expressed by some formula, a conversation
between an engineer and a mathematician would take the following
form.

Engineer. When designing a structure I arrived at this equation (shows the
equation). I must have a solution quickly: in a month's time I must finish the
project.

Mathematician. I would gladly help you, but there is no solution for an
equation of this type.

Engineer. Couldn't you derive the formula?

Mathematician. It's no good trying. It was proved long ago that there
is no formula for the solution of such equations.

One could imagine that after such a conversation the
engineer's opinion of mathematics and its possibilities would change
for the worse. Happily, such conversations do not take place. Actually,
the engineer usually has no need of a formula for the solution of this
or that equation. What he needs is an answer with a certain degree of
accuracy; whether the answer was obtained from a formula or by
some other means is not of much interest to him.

Imagine, for instance, that the formula has been found and that the
answer obtained from it is $x = 3 + \sqrt{13}$. Clearly, this answer cannot be
directly used in practice [one can hardly ask a mechanic to make
a part $(3 + \sqrt{13})$ cm long]. For practical purposes $\sqrt{13}$ should be
expressed in decimals, and as many digits after the decimal point should
be taken as are required for the given practical problem.

Hence, the engineer will be quite satisfied if the mathematician tells 
him how to calculate the roots of the equation with the necessary 
degree of accuracy. Mathematics has developed a number of methods 
for the approximate solution of equations. Some of them are described 
in this book. 


*On algebraic equations see A. G. Kurosh, Algebraic Equations of Arbitrary
Degrees, Mir Publishers, Moscow, 1977.


2. Successive Approximations

Most methods for the approximate solution of equations are based on
the idea of successive approximations. This idea is used not only to
solve equations, but to solve a number of practical problems as well.

The method of successive approximations, or the trial and error
method, is used by gunners. In order to hit a target they set the azimuth
scale and the sight and fire the gun. If the target is missed, the
setting of the azimuth scale and of the sight is corrected in accordance
with the observed position of the shell's explosion, and the next round
is fired. After several approximations they are able to set the azimuth
scale and the sight so as to hit the target.

[Fig. 1. The gun is at point $O$; the points $A_0$, $A_1$, $A_2$, $A_3$ are marked along the flight path of the aircraft.]

Sometimes successive approximations are also needed to determine
the aiming point. Suppose an anti-aircraft gun at point $O$ fires at an
aircraft in flight (Fig. 1). If the gun is aimed at the point $A_0$ where the plane
is at that moment, it will miss, for the plane will move to another point
$A_1$ while the shell travels. If one knows the velocities of the plane and
of the shell one may find this point $A_1$ comparatively easily. However,
if the gun is aimed at point $A_1$ the target may still be missed. This is
because changing the inclination of the gun barrel changes the path of the
shell, and therefore the time taken by the shell to cover the distance
$OA_0$ is not the same as that needed to cover the distance $OA_1$; as
a result the shell will not hit the plane. But in the latter case the miss
will not be as great as when the gun is aimed at point $A_0$. To make it
still smaller one should find the time taken by the shell to cover the distance
$OA_1$, as well as the point reached by the plane in this time. This
point $A_2$ will be the next approximation for the aiming point sought.
After that we shall have to find the time the shell takes to reach point
$A_2$ and calculate the point $A_3$ which the plane will reach after that
time.

After several approximations we shall find the aiming point with
the necessary degree of accuracy.
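The aiming procedure can be imitated in a few lines under strongly simplified assumptions that are ours, not the book's: the shell is taken to fly in a straight line at a constant speed, the plane to move with constant velocity, and the gun to stand at the origin. The name `aiming_points` and all numbers are hypothetical.

```python
import math

def aiming_points(p0, u, shell_speed, steps):
    """Successive aiming points A_0, A_1, ... for a gun at the origin O."""
    points = [p0]                                   # A_0: where the plane is now
    for _ in range(steps):
        ax, ay = points[-1]
        t = math.hypot(ax, ay) / shell_speed        # shell's flight time to A_n
        points.append((p0[0] + u[0] * t, p0[1] + u[1] * t))  # plane after time t
    return points

# Plane at (3000, 4000) m flying at 200 m/s along x; shell speed 800 m/s.
pts = aiming_points((3000.0, 4000.0), (200.0, 0.0), 800.0, 20)
print(pts[1], pts[-1])
```

In this model each correction is at most a quarter of the previous one (the ratio of the plane's speed to the shell's), so the points settle quickly.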

The method of successive approximations is also used to solve many
other problems.

Suppose one has to transport sand from several sand-pits $A_1, \ldots, A_n$
to several building sites $B_1, \ldots, B_m$. Suppose the productivity of the pit
$A_j$ is $a_j$ tons per day, and the amount of sand required by the site $B_k$ is
$b_k$ tons per day. Finally, let the cost of transporting a ton of sand from
the pit $A_j$ to the site $B_k$ be $c_{jk}$ (this quantity depends on the distance
between $A_j$ and $B_k$, on the state of the roads, etc.).

To prepare a transportation plan let us compile Table 1. In this
table $x_{jk}$ denotes the amount of sand transported from the pit $A_j$ to the
site $B_k$.


Table 1

           $B_1$      $B_2$      ...   $B_m$
$A_1$      $x_{11}$   $x_{12}$   ...   $x_{1m}$
$A_2$      $x_{21}$   $x_{22}$   ...   $x_{2m}$
...        ...        ...        ...   ...
$A_n$      $x_{n1}$   $x_{n2}$   ...   $x_{nm}$

The numbers $x_{jk}$ should of course satisfy the following relations:

$$x_{j1} + x_{j2} + \ldots + x_{jm} \le a_j$$

(the amount of sand transported from the pit $A_j$ per day should not
exceed $a_j$ tons),

$$x_{1k} + x_{2k} + \ldots + x_{nk} = b_k$$

(the site $B_k$ should receive $b_k$ tons of sand per day).

If the plan given in Table 1 is adopted, the cost of the transportation
of sand will be

$$c = c_{11}x_{11} + c_{12}x_{12} + \ldots + c_{1m}x_{1m} + c_{21}x_{21} + c_{22}x_{22} + \ldots + c_{2m}x_{2m} + \ldots + c_{n1}x_{n1} + c_{n2}x_{n2} + \ldots + c_{nm}x_{nm} \tag{4}$$

The plan should be such that the cost $c$ is the smallest possible. To
begin with, a tentative plan is devised. For instance, the following
method may be used.

The pit $A_1$ is made the contractor for the site nearest to it. If the
sand production of this pit exceeds the requirements of that site, it is
made to supply another site, the nearest to it of all the remaining sites.
After several steps the productivity of the pit $A_1$ will be exhausted.
After that the pit $A_2$ is made the contractor for the nearest of the
remaining sites, etc. Eventually every site will have its contractor pit.

However, a plan devised in this way is not the best one because in 
the end only a few building sites will remain and they may be quite 
distant from the remaining pits. So the plan will have to be revised. 
Some of the contracts between the sites and the pits with smaller 
numbers will have to be cancelled and the sites supplied by pits with 
greater numbers. 
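The tentative plan described above can be sketched as follows. We assume, for illustration only, that "nearest" means "cheapest to reach", so that the sites are visited in order of increasing cost; the function names and the small data set are hypothetical.

```python
def tentative_plan(supply, demand, cost):
    """Greedy first plan: each pit in turn serves its cheapest unmet sites."""
    remaining = list(demand)                      # demand still unmet at each site
    plan = [[0] * len(demand) for _ in supply]    # plan[j][k] plays the role of x_jk
    for j, left in enumerate(supply):
        # Visit the sites in order of increasing transport cost from pit j.
        for k in sorted(range(len(demand)), key=lambda s: cost[j][s]):
            shipped = min(left, remaining[k])
            plan[j][k] = shipped
            left -= shipped
            remaining[k] -= shipped
    return plan

def total_cost(plan, cost):
    """The sum (4) for a given plan."""
    return sum(c * x for c_row, x_row in zip(cost, plan)
               for c, x in zip(c_row, x_row))

supply = [30, 40]                 # a_1, a_2 (tons per day)
demand = [20, 25, 15]             # b_1, b_2, b_3 (tons per day)
cost = [[1, 4, 6], [5, 2, 3]]     # c_jk
plan = tentative_plan(supply, demand, cost)
print(plan, total_cost(plan, cost))
```

For these numbers the greedy plan costs 135, whereas supplying every site from its cheapest pit is feasible and costs 115, so the tentative plan indeed leaves room for improvement.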

The methods of changing the plan, leading to a reduction in transportation
costs, are considered in a branch of mathematics called
linear programming*.

After several approximations made with the aid of these methods 
we shall arrive at a plan for which the sum (4) is a minimum, or close 
to a minimum. 

In general, when devising a plan, a timetable, etc. the practice is to
begin with some rough approximation which is subsequently improved
step by step until finally the required result is obtained.

The machining of a part in a factory workshop may also be considered
to be a process of successive approximation to the desired
shape. At first some rough approximation (a casting or a blank) is
taken. This blank is machined on a lathe to obtain a shape close to
that of the part being fabricated. After that it is passed on to a more
accurate lathe. The required part is produced after several such
machining operations, i.e. after several approximations.


*On linear programming see A. S. Solodovnikov, "Introduction to Linear
Algebra and Linear Programming", Prosveshcheniye, Moscow, 1966.


3. Achilles and the Tortoise

The first to mention successive approximations was Zeno of Elea
who lived c. 500 B. C. This philosopher tried to prove that there was
no motion in nature. The following reasoning was used by Zeno to
prove the absence of motion: if the fastest Greek runner Achilles were
to try to catch up with a tortoise, he would be unable to. Indeed, suppose
the distance between Achilles and the tortoise is 1000 steps, and
in one second Achilles runs 10 steps, while the tortoise crawls one step.
In 100 seconds Achilles will run the 1000 steps separating him from the
tortoise. But during this time the tortoise will crawl 100 steps. In 10
seconds Achilles will run these 100 steps, but the tortoise will crawl a further
10 steps. To cover this distance Achilles will need another second, during
which the tortoise will move one step further. Hence, the tortoise
will always be in front of Achilles and he will never be able to catch up
with it. Consequently, there is no motion.

Of course, such reasoning by Zeno is only a witty paradox and
nothing more. Motion is an intrinsic property of matter.

Any pupil will have no difficulty in calculating when Achilles
catches up with the tortoise. To do this one should formulate the
equation

$$10x - x = 1000 \tag{5}$$

where $x$ is the time sought. From this equation we obtain

$$x = 111\tfrac{1}{9}\ \text{s}$$

where "s" denotes seconds.

However, Zeno's reasoning may be regarded as a particular method
of approximate solution of equation (5).

Indeed, transpose $x$ to the right-hand side of the equation and
divide both sides by 10. We obtain the equation

$$x = 100 + \frac{x}{10} \tag{6}$$

If in the right-hand side we neglect the term $x/10$ (it is small compared
with $x$) we obtain for $x$ the approximate solution $x_1 = 100$. Now we
can make the answer more precise by substituting for $x$ in the right-hand
side the approximation obtained, $x_1 = 100$. We obtain
a more accurate value for $x$, i.e. $x_2 = 100 + 10 = 110$. Substituting the
new value into the right-hand side of the equation we find the next
approximation $x_3 = 100 + 110/10 = 111$. In this way we obtain the
approximations

$$x_1 = 100, \quad x_2 = 110, \quad x_3 = 111, \quad x_4 = 111.1, \quad \ldots$$

i.e. the same numbers that followed from Zeno's reasoning. These
numbers are connected by the relationship

$$x_{n+1} = 100 + \frac{x_n}{10} \tag{7}$$

which enables them to be calculated successively. As $n$ increases
they approach the exact solution $x = 111\tfrac{1}{9}$ of equation (5).
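Relationship (7) is easy to run mechanically. The sketch below uses a small helper, `iterate`, which is our own name and not something from the text; starting from $x_0 = 0$ reproduces $x_1 = 100$ as the first computed value.

```python
def iterate(g, x0, steps):
    """Return [x0, g(x0), g(g(x0)), ...] with `steps` applications of g."""
    xs = [x0]
    for _ in range(steps):
        xs.append(g(xs[-1]))
    return xs

# Formula (7): x_{n+1} = 100 + x_n / 10, started from x_0 = 0 so that x_1 = 100.
tortoise = iterate(lambda x: 100 + x / 10, 0.0, 8)
print(tortoise[1:5])     # the approximations x_1 .. x_4 from the text
print(tortoise[-1])      # x_8, already very close to 1000/9
```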

The method of solution described above proved successful because
the term $x/10$ was small in comparison with $x$. Otherwise we would
have obtained numbers which did not get closer and closer to the
solution sought. Suppose, for instance, that Achilles took on not
a slow tortoise, but a light-footed antelope, which runs 20 steps per
second. To find the time it will take Achilles to catch up with the antelope
we must solve the equation

$$10x - 20x = 1000 \tag{8}$$

Its solution is $x = -100$. This means that Achilles and the antelope
ran neck and neck 100 seconds ago, and that now the antelope has
overtaken Achilles, the distance separating them growing with time.

Let us try to solve equation (8) using the same method we used to
solve equation (5). To do this transpose the term $20x$ to the right-hand
side and divide both sides of the equation by 10. We obtain the
equation

$$x = 100 + 2x \tag{9}$$

Put $x_0 = 0$ in the right-hand side. We find that $x_1 = 100$. Substituting
this value into the right-hand side of equation (9) we obtain the next
approximation $x_2 = 300$. Continuing the process we obtain the
numbers

$$x_0 = 0, \quad x_1 = 100, \quad x_2 = 300, \quad x_3 = 700, \quad \ldots$$

We see that the numbers do not approach the exact solution
$x = -100$ of equation (8).
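The contrast between the tortoise and the antelope can be made explicit by tracking the distance to the exact root. In this sketch the equation is written as $x = c + qx$; the names `errors`, `c` and `q` are ours. Each step multiplies the error by $q$, so it shrinks for $q = 1/10$ and doubles for $q = 2$.

```python
def errors(c, q, x0, steps):
    """Distance from the exact root of x = c + q*x after each iteration step."""
    root = c / (1 - q)          # exact solution, defined for q != 1
    x, out = x0, []
    for _ in range(steps):
        x = c + q * x
        out.append(abs(x - root))
    return out

print(errors(100, 0.1, 0, 4))   # tortoise, equation (6): errors shrink tenfold
print(errors(100, 2, 0, 4))     # antelope, equation (9): errors double
```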

4. Division on Electronic Computers 

The reader may perhaps be puzzled: why solve equation (5) by the
method of successive approximations when it is simple to solve it
exactly? But of course equation (5) itself was of little interest to
us; we were interested in the method of successive approximations,
which we intend to apply to more complex equations.

By the way, it is worth mentioning that with the appearance of high-speed
electronic computers it has fairly recently become necessary to
solve equations similar to Eq. (5) by means of the method of successive
approximations. Some computers can perform only three arithmetical
operations: addition, subtraction and multiplication. Apart from that
they can divide by numbers of the form $2^n$. What method do such
computers use to divide by arbitrary numbers?

The division of the number $b$ by the number $a$ entails the solution of
the equation $ax = b$. Since the computer can multiply and divide by $2^n$
we may assume that $1/2 \le a < 1$ (otherwise we can multiply, or divide,
both sides of the equation $ax = b$ by the number 2 raised to the appropriate
power). Rewrite the equation $ax = b$ in the form

$$x = (1 - a)x + b \tag{10}$$

Let $x_1 = b$ be the first approximation for $x$. Denote the error of this
approximation by $\alpha_1$, i.e. suppose that $x_1 + \alpha_1 = b/a$. Then we obtain
from equation (10)

$$x_1 + \alpha_1 = (1 - a)(x_1 + \alpha_1) + b = (1 - a)x_1 + b + (1 - a)\alpha_1 \tag{11}$$

Since $1/2 \le a < 1$ it follows that

$$0 < 1 - a \le 1/2$$

The factor $1 - a$ being comparatively small, we discard the term
$(1 - a)\alpha_1$ in the right-hand side of equation (11), which is not greater
in absolute value than $|\alpha_1|/2$. We obtain

$$x_1 + \alpha_1 \approx (1 - a)x_1 + b$$

The number

$$x_2 = (1 - a)x_1 + b$$

will be taken as the next approximation for $x$.

Denote the error of the approximation $x_2$ by $\alpha_2$, i.e. put
$x_2 + \alpha_2 = b/a$. Then we obtain from equation (10)

$$x_2 + \alpha_2 = (1 - a)x_2 + b + (1 - a)\alpha_2$$

Discarding the term $(1 - a)\alpha_2$ in the right-hand side of this equation
we obtain the approximate equation

$$x_2 + \alpha_2 \approx (1 - a)x_2 + b$$

Hence, we may choose as the next approximation

$$x_3 = (1 - a)x_2 + b$$

By similar reasoning we arrive at the next approximation

$$x_4 = (1 - a)x_3 + b$$

etc. The numbers $x_1, x_2, \ldots, x_n, \ldots$ successively computed from the
formula

$$x_{n+1} = (1 - a)x_n + b \tag{12}$$

approach the number $b/a$. But this formula makes use only of the operations
of addition, subtraction and multiplication, and this means
that the computer may use it for calculating.
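Formula (12) translates directly into a loop that never divides. The sketch below is illustrative rather than a description of any particular machine: the names `divide` and `divide_any` and the fixed step count of 60 are our assumptions, and the standard-library calls `math.frexp` and `math.ldexp` stand in for the scaling by powers of 2 mentioned in the text.

```python
import math

def divide(b, a, steps=60):
    """Approximate b / a for 1/2 <= a < 1 using only +, - and *."""
    x = b                           # first approximation x_1 = b
    for _ in range(steps):
        x = (1 - a) * x + b         # formula (12)
    return x

def divide_any(b, a, steps=60):
    """Divide by any a > 0 after scaling a into [1/2, 1) by a power of 2."""
    m, e = math.frexp(a)            # a = m * 2**e with 1/2 <= m < 1
    return math.ldexp(divide(b, m, steps), -e)   # (b / m) * 2**(-e)

print(divide(3.0, 0.75))            # close to 4
print(divide_any(1.0, 7.0))         # close to 1/7
```

Since the error is at least halved at every step, 60 steps are more than enough for ordinary double-precision numbers.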

The division method described above is actually based on the formula
for the sum of an infinite decreasing geometric progression. Indeed,
writing the fraction $b/a$ in the form

$$\frac{b}{a} = \frac{b}{1 - (1 - a)}$$

we obtain with the above-mentioned formula

$$\frac{b}{1 - (1 - a)} = b + b(1 - a) + b(1 - a)^2 + \ldots + b(1 - a)^{n-1} + \ldots \tag{13}$$

Denote the sum of the first $n$ terms of this progression by $x_n$:

$$x_n = b + b(1 - a) + \ldots + b(1 - a)^{n-1}$$

Obviously

$$x_{n+1} = b + b(1 - a) + \ldots + b(1 - a)^n = b + (1 - a)\left[b + b(1 - a) + \ldots + b(1 - a)^{n-1}\right] = b + (1 - a)x_n$$

This formula coincides with formula (12). Hence, by substituting the
approximate value $x_n$ for the fraction $b/a$ we substitute the sum of the
first $n$ terms for the infinite sum in formula (13). As the number $n$ of
summands increases, this sum approaches the sum of the entire progression
(the progression (13) is a decreasing one since $1/2 \le a < 1$ and
so $0 < 1 - a \le 1/2$).
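The same progression also yields an explicit error estimate. It is not stated in the text, but it follows from the formula for the sum of a geometric progression: the terms omitted from the sum of the first $n$ terms themselves form a progression with first term $b(1 - a)^n$ and ratio $1 - a$, so that

$$\frac{b}{a} - x_n = b(1 - a)^n + b(1 - a)^{n+1} + \ldots = \frac{b(1 - a)^n}{a}$$

Since $0 < 1 - a \le 1/2$, the absolute value of this error does not exceed $|b/a| \cdot 2^{-n}$; roughly speaking, every step supplies at least one more correct binary digit of the quotient.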
5. Extraction of Square Roots by Method
of Successive Approximations

Let's demonstrate now how the method of successive approximations
is applied to the extraction of square roots. At school a method
is learned which enables the decimal digits of a square root to be
found one after the other. It, too, may be regarded as a method
of successively approximating the answer. However, this method is
rather complicated, and students often use it mechanically, not fully
understanding how it works. We shall describe another method which
was in use in ancient Babylon. It was also used by the Greek geometer
Hero (Heron) of Alexandria. Subsequently this method was forgotten,
but now it is sometimes used in electronic computers to extract square
roots.

Suppose, for instance, we have to extract the square root of the
number 28. At first choose some approximate value of this root, for instance,
put $x_1 = 5$. We shall denote the error of this approximate value
by $\alpha_1$, i.e. we shall put $\sqrt{28} = 5 + \alpha_1$. To obtain $\alpha_1$ we square
both sides of the equation and obtain

$$28 = 25 + 10\alpha_1 + \alpha_1^2$$

i.e.

$$\alpha_1^2 + 10\alpha_1 - 3 = 0 \tag{14}$$

Thus, we have obtained a quadratic equation for $\alpha_1$. If we try to solve
this equation exactly, we obtain $\alpha_1 = -5 + \sqrt{28}$. Hence, to find $\alpha_1$
accurately we must compute $\sqrt{28}$. It seems that we have found
ourselves in a vicious circle: to find $\sqrt{28}$ we must compute
$\alpha_1$, and to compute $\alpha_1$ we must calculate $\sqrt{28}$.

The following reasoning comes to our rescue. The error $\alpha_1$ in the
approximate value $x_1 = 5$ is not large, certainly less than unity. The
number $\alpha_1^2$ is smaller still. Therefore we shall try to find $\alpha_1$ discarding the
small term $\alpha_1^2$ in equation (14). Then we shall obtain for $\alpha_1$ the approximate
equation $10\alpha_1 - 3 \approx 0$, whence $\alpha_1 \approx 0.3$.

Thus, we have found the approximate value of the correction $\alpha_1$.
Since $\sqrt{28} = 5 + \alpha_1$, the second approximation $x_2$ for $\sqrt{28}$ takes
the form

$$x_2 = 5 + 0.3 = 5.3$$

To obtain a still more accurate approximation for $\sqrt{28}$ let us
repeat the above process, i.e. let us denote the error in the value
$x_2 = 5.3$ by $\alpha_2$, putting $\sqrt{28} = x_2 + \alpha_2$. Squaring both sides
of this equality and discarding the small term $\alpha_2^2$ we get
$28 \approx x_2^2 + 2x_2\alpha_2$ and therefore

$$\alpha_2 \approx \frac{28 - x_2^2}{2x_2}$$

This means that the formula for the third approximation for $\sqrt{28}$ is

$$x_3 = x_2 + \frac{28 - x_2^2}{2x_2} = \frac{28 + x_2^2}{2x_2}$$

Since $x_2 = 5.3$ we obtain from here $x_3 = 5.2915\ldots$ In the same way,
starting with the approximate value $x_3 = 5.2915$, we obtain the next
approximation $x_4$ expressed by the formula

$$x_4 = \frac{28 + x_3^2}{2x_3} = 5.2915\ldots$$

Generally, if we have already found the approximation $x_n$ for $\sqrt{28}$, the
next approximation for it is

$$x_{n+1} = \frac{28 + x_n^2}{2x_n} \tag{15}$$

Thus every new step in the process gives an ever more accurate
approximation for $\sqrt{28}$. The computation process stops when the difference
between $x_{n+1}$ and $x_n$ becomes less than the specified computation
accuracy. For instance, if we have to compute $\sqrt{28}$ with an
accuracy of 0.0001, four approximations are enough and we may
put $\sqrt{28} \approx 5.2915$ (indeed, $x_3 = 5.2915\ldots$ and $x_4 = 5.2915\ldots$).
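The computation just described, including the stopping rule, can be sketched as follows. The function name `heron_sqrt` and its parameters are our own choices for illustration.

```python
def heron_sqrt(a, x1, tol):
    """Approximations x_1, x_2, ... from x_{n+1} = (a + x_n**2) / (2 * x_n).

    Stops as soon as two successive approximations differ by less than tol.
    """
    xs = [x1]
    while True:
        x = xs[-1]
        xs.append((a + x * x) / (2 * x))
        if abs(xs[-1] - xs[-2]) < tol:
            return xs

approximations = heron_sqrt(28.0, 5.0, 1e-4)
print(approximations)   # x_1 = 5, x_2 = 5.3, then two values beginning 5.2915
```

With the accuracy 0.0001 the loop stops at the fourth approximation, in agreement with the text.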

The same method may be used to extract the square root of any
other positive number. Thus, when computing $\sqrt{a}$ we choose some initial
approximate value $x_1$ and then compute the next approximations
with the aid of the formula

$$x_{n+1} = \frac{a + x_n^2}{2x_n} \tag{16}$$

Formula (16) may be derived by reasoning somewhat different
from that used in extracting the root $\sqrt{28}$. Suppose we have
already found the $n$-th approximation $x_n$ for $\sqrt{a}$. Since

$$\sqrt{a} = \sqrt{x_n \cdot \frac{a}{x_n}}$$

it follows that $\sqrt{a}$ is the geometric mean of the numbers $x_n$ and $a/x_n$.
We shall take as the approximate value of this geometric mean the
arithmetic mean of the numbers $x_n$ and $a/x_n$, i.e. we shall put

$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right) = \frac{x_n^2 + a}{2x_n}$$

This is just formula (16).

Hence, the method of approximate extraction of square roots described
above consists in substituting the arithmetic mean for the geometric
mean of the numbers $x_n$ and $a/x_n$ at every step.

Let us now discuss whether or not the process of successive 
approximations as applied to the extraction of square roots always 
leads to an answer, i. e. whether the situation is always the same as in 
the case of Achilles and the tortoise, or whether it is sometimes as in 
the case of Achilles running after the antelope (mathematicians say 
that in the first case the process converges and in the second case 
diverges). We shall prove that the process of extracting square roots is 
never fraught with complications — it is always a convergent process 
and it always leads to the desired result. 

To this end let us compare the errors $\alpha_n = \sqrt{a} - x_n$ and
$\alpha_{n+1} = \sqrt{a} - x_{n+1}$ of two successive approximations. The error
$\alpha_{n+1}$ may, in accordance with formula (16), be written in the form

$$\alpha_{n+1} = \sqrt{a} - x_{n+1} = \sqrt{a} - \frac{x_n^2 + a}{2x_n} = -\frac{x_n^2 - 2x_n\sqrt{a} + a}{2x_n}$$

But

$$x_n^2 - 2x_n\sqrt{a} + a = (x_n - \sqrt{a})^2 = \alpha_n^2$$

and therefore

$$\alpha_{n+1} = -\frac{\alpha_n^2}{2x_n} \tag{17}$$

We consider only positive approximate values $x_n$ of $\sqrt{a}$. Therefore we
may conclude from equality (17) that all the errors $\alpha_2, \alpha_3, \ldots, \alpha_n, \ldots$
are negative. In other words, all approximations starting with
the second one are excessive approximations*; the first approximation
$x_1$ may be either excessive or deficient.


*The explanation is that the arithmetic mean of two unequal positive
numbers is always greater than their geometric mean.

With the aid of formula (17) it may easily be proved that the absolute
value of the error in the approximate value $x_n$ is at least halved
with every step. Indeed, equality (17) may be written in the
form

$$\alpha_{n+1} = -\alpha_n \frac{\sqrt{a} - x_n}{2x_n} = \alpha_n\left(\frac{1}{2} - \frac{\sqrt{a}}{2x_n}\right)$$

Therefore

$$|\alpha_{n+1}| = |\alpha_n| \left|\frac{1}{2} - \frac{\sqrt{a}}{2x_n}\right| \tag{18}$$

But since $x_n > 0$ it follows that

$$\frac{1}{2} - \frac{\sqrt{a}}{2x_n} < \frac{1}{2}$$

On the other hand, as was shown above, for $n \ge 2$ we have $x_n \ge \sqrt{a}$
and therefore

$$\frac{1}{2} - \frac{\sqrt{a}}{2x_n} \ge 0$$

This leads to the inequality

$$\left|\frac{1}{2} - \frac{\sqrt{a}}{2x_n}\right| < \frac{1}{2} \tag{19}$$

Comparing relations (18) and (19) we see that

$$|\alpha_{n+1}| < \frac{1}{2}|\alpha_n|$$

This proves the truth of our statement: with every step the absolute
value of the error decreases to less than half its previous value. This
means that after the second approximation step the error will decrease
in absolute value to less than one quarter of its original value, after the
third to less than one eighth, etc. Clearly, as $n$ increases, the absolute
value of the error $\alpha_n = \sqrt{a} - x_n$ will decrease and tend to zero. But
this just means that the numbers $x_n$ tend to $\sqrt{a}$ as $n$ increases.
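The halving of the error can be observed numerically. The sketch below deliberately starts from a poor first approximation and records the ratios $|\alpha_{n+1}| / |\alpha_n|$ from the second approximation onwards; the function name `error_ratios` is ours, and `math.sqrt` is used only as a reference value for measuring the error.

```python
import math

def error_ratios(a, x1, steps):
    """Ratios |alpha_{n+1}| / |alpha_n| for n = 2, 3, ... in formula (16)."""
    root = math.sqrt(a)                    # reference value, used only to measure errors
    x = (a + x1 * x1) / (2 * x1)           # x_2: from here on x_n >= sqrt(a)
    ratios = []
    for _ in range(steps):
        x_next = (a + x * x) / (2 * x)
        ratios.append(abs(root - x_next) / abs(root - x))
        x = x_next
    return ratios

print(error_ratios(28.0, 1000.0, 6))       # every ratio is below 1/2
```

Far from the root the ratios are only slightly below 1/2; they become much smaller as $x_n$ comes closer to $\sqrt{a}$.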

Let us now discuss how the choice of the initial approximation $x_1$
affects the approximation process. To begin with, note that this choice
has no effect whatsoever on the final result, for we have already proved
that no matter what initial approximation $x_1$ was chosen, the errors
$\alpha_2, \ldots, \alpha_n, \ldots$ of the subsequent approximations tend to zero as $n \to \infty$.
Hence, if the necessary computation accuracy is specified, the same
value of $\sqrt{a}$ within that accuracy will be obtained for all initial
approximations $x_1$. Even if the initial approximation is
chosen very badly, we shall eventually arrive at the correct result. After
ten approximation steps the absolute value of the error will decrease
at least a thousand times ($2^{10} = 1024 \approx 1000$) and after forty, at least
a trillion ($10^{12}$) times. Thus, if when computing $\sqrt{1}$ we put $x_1 = 10^6$, so
that $|\alpha_1| \approx 10^6$, then $|\alpha_{40}| < 10^{-6}$. In other words, at the beginning of
the process the error was about a million, and at the end its absolute
value became less than one millionth.

Nevertheless, the choice of the initial approximation affects the
length of the approximation process. If the initial approximation is
unfortunate, one has to wait a long time before the difference between
$x_{n+1}$ and $x_n$ becomes less than the specified computation accuracy.
A good choice of the initial approximation speeds up the process.
Hence it is often the practice to take the initial approximation from
the tables of square roots and to use the formula

$$x_2 = \frac{a + x_1^2}{2x_1} \tag{20}$$

only to obtain a more precise value.
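The effect of the starting value on the amount of work is easy to measure. In this sketch, `steps_needed` (our name) counts how many times formula (16) must be applied before two successive approximations differ by less than a tolerance; the tolerance $10^{-6}$ is an arbitrary choice for illustration.

```python
def steps_needed(a, x1, tol=1e-6):
    """Number of applications of formula (16) until |x_{n+1} - x_n| < tol."""
    x, n = x1, 0
    while True:
        x_next = (a + x * x) / (2 * x)
        n += 1
        if abs(x_next - x) < tol:
            return n
        x = x_next

print(steps_needed(238.0, 15.43))   # starting value taken from a table
print(steps_needed(238.0, 1e6))     # very poor starting value
```

Both runs arrive at the same root; the poor start merely costs many additional steps.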

This method is especially convenient because the rate of decrease in
the error is appreciably higher as $x_n$ approaches $\sqrt{a}$. This is because in
deriving the inequality

$$|\alpha_{n+1}| < \frac{1}{2}|\alpha_n|$$

we have substituted the number 1/2 for the factor
$\left|\frac{1}{2} - \frac{\sqrt{a}}{2x_n}\right|$ in formula (18). However, if $x_n$ is close
to $\sqrt{a}$, the difference $\frac{1}{2} - \frac{\sqrt{a}}{2x_n}$ is very
small and therefore $|\alpha_{n+1}|$ is much less than $|\alpha_n|$.

We can say this more precisely. To do so consider, together with the
absolute error $|\alpha_n| = |\sqrt{a} - x_n|$, the relative error $\beta_n$ of the approximate
value $x_n$, i.e. the ratio of the absolute error $|\alpha_n|$ to the exact value
of the root $\sqrt{a}$. This error is expressed by the formula

$$\beta_n = \frac{|\alpha_n|}{\sqrt{a}}$$

The following formula for the quantity $\beta_{n+1}$ may be obtained from
equation (17):

$$\beta_{n+1} = \frac{|\alpha_{n+1}|}{\sqrt{a}} = \frac{|\alpha_n|^2}{2x_n\sqrt{a}}$$

Since $x_n \ge \sqrt{a}$ (for $n \ge 2$) it follows that

$$\beta_{n+1} \le \frac{1}{2}\left(\frac{|\alpha_n|}{\sqrt{a}}\right)^2 = \frac{\beta_n^2}{2}$$

Thus, the relative errors $\beta_n$ satisfy the inequality

$$\beta_{n+1} \le \frac{\beta_n^2}{2} \tag{21}$$

For instance, if the relative error of the approximation $x_n$ is 0.01, it
does not exceed 0.00005 for $x_{n+1}$ and 0.0000000013 for $x_{n+2}$. We
see that the accuracy of the approximations improves at an ever
increasing rate. It may be demonstrated that when we are quite close
to $\sqrt{a}$ every successive approximation doubles the number of correct
significant digits.
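The numbers in this example are simply the bound $\beta^2/2$ applied repeatedly. A short sketch (the function name `bound_sequence` is ours) makes the doubling of correct digits visible:

```python
def bound_sequence(beta, steps):
    """Upper bounds on successive relative errors: beta -> beta**2 / 2."""
    out = [beta]
    for _ in range(steps):
        out.append(out[-1] ** 2 / 2)
    return out

# Starting from a relative error of 0.01, the bound is squared and halved each step.
for n, bound in enumerate(bound_sequence(0.01, 3)):
    print(n, bound)
```

These are upper bounds only; the actual errors may be smaller still.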

Example. Compute $\sqrt{238}$ with an accuracy of 0.00001.

From a table of square roots we find $\sqrt{238} \approx 15.43$. Put $x_1 = 15.43$
and find $x_2$ using the formula

$$x_2 = \frac{15.43^2 + 238}{30.86} = 15.42725$$

Assess the accuracy of the answer obtained. Since the error of the
value 15.43 does not exceed 0.01, $|\alpha_1| \le 0.01$ and therefore

$$\beta_1 \le \frac{0.01}{15.43} < 0.001$$

But in this case

$$\beta_2 < \frac{0.001^2}{2} = 0.0000005$$

This means that the absolute error of the approximation $x_2$ does not
exceed the value $15.43 \times 0.0000005 < 0.00001$. In other words, all
seven digits of the value $\sqrt{238} = 15.42725$ are correct.

If we wanted to have fourteen correct digits we could obtain the 
necessary result from just the third approximation. However, the need 
for such accuracy is very rare. 
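The claims of this example can be checked with extended-precision arithmetic. The sketch below uses Python's standard `decimal` module with 40 significant digits (a precision we chose arbitrarily), and `Decimal.sqrt` serves only as the reference value for measuring the errors.

```python
from decimal import Decimal, getcontext

getcontext().prec = 40                 # work with 40 significant digits

a = Decimal(238)
x1 = Decimal("15.43")                  # value taken from the table
x2 = (a + x1 * x1) / (2 * x1)          # formula (20)
x3 = (a + x2 * x2) / (2 * x2)          # one more step of formula (16)
root = a.sqrt()                        # reference value for comparison

print(x2, abs(root - x2))              # error of x_2, well below 0.00001
print(x3, abs(root - x3))              # error of x_3, below 1e-13
```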
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Let us finally mention the following peculiarity of the method of 
successive approximations When the usual method of extracting 
square roots is employed, an error made at any stage completely inva¬ 
lidates all subsequent computations. The situation is different when 
the method of successive approximations is used. Suppose that as 
a result of an error we obtained a wrong value y„ of the n -th approxi¬ 
mation instead of the right value x„. In this case all the subsequent 
computations may be regarded as computations of ]/a with the initial 
approximation y„. But we have already seen above that the method of 
successive approximations leads us to the correct value of |fa to the 
required accuracy no matter what initial value was chosen. Hence, the 
error we made will eventually tend to zero. The only effect it will have 
is to force us to take a few extra approximation steps. 

Because of this peculiarity of the method of successive approxima¬ 
tions the computations may be started with low accuracy, the speci¬ 
fied accuracy being employed only for the final approximations. This 
shortens the time needed for the computations. 
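This self-correcting behaviour can be illustrated with a small experiment. In the sketch below the function `sqrt_iterations` and its parameters `slip_at` and `slip_value` are hypothetical names introduced only for the demonstration: one intermediate value is deliberately replaced by a wrong number, and the process is then allowed to continue.

```python
import math


def sqrt_iterations(a, x, steps, slip_at=None, slip_value=None):
    """Run x -> (x*x + a) / (2*x); optionally overwrite one iterate with a wrong value."""
    history = [x]
    for n in range(1, steps + 1):
        x = (x * x + a) / (2 * x)
        if n == slip_at:
            x = slip_value  # simulate an arithmetic slip at step n
        history.append(x)
    return history


clean = sqrt_iterations(238, 15.0, 8)
slipped = sqrt_iterations(238, 15.0, 8, slip_at=2, slip_value=20.0)
print(clean[-1], slipped[-1], math.sqrt(238))
```

Both runs end at the same value; the slip only costs a few extra steps.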


6. Extraction of Roots with Positive Integer Indices
Using Method of Successive Approximations

The method of extracting square roots described above may be used for
extracting roots with any positive integer index, as well. For this
purpose we shall need the formula*

$$(x + \alpha)^k = x^k + k x^{k-1}\alpha + \dots \tag{22}$$

where the dots denote terms containing $\alpha^2$, $\alpha^3$, etc.

*This formula follows from the binomial theorem, but we do not expect
the reader to be acquainted with this theorem.

Let us prove this formula. It is known from the school mathematics
course that

$$(x + \alpha)^2 = x^2 + 2x\alpha + \alpha^2$$

$$(x + \alpha)^3 = x^3 + 3x^2\alpha + 3x\alpha^2 + \alpha^3$$

These equations may be rewritten in the following form:

$$(x + \alpha)^2 = x^2 + 2x\alpha + \dots \tag{23}$$

$$(x + \alpha)^3 = x^3 + 3x^2\alpha + \dots \tag{24}$$

Hence, formula (22) has been proved for $k = 2$ and $k = 3$. Now
multiply both sides of formula (24) by $(x + \alpha)$. We obtain

$$(x + \alpha)^4 = (x^3 + 3x^2\alpha + \dots)(x + \alpha)$$

If we remove the brackets in this equation we obtain one term $x^4$ not
containing $\alpha$, and two terms $3x^3\alpha$ and $x^3\alpha$
containing $\alpha$ to the first power; the other terms contain $\alpha$
to the second and higher powers. Therefore we may write

$$(x + \alpha)^4 = x^4 + 3x^3\alpha + x^3\alpha + \dots = x^4 + 4x^3\alpha + \dots \tag{25}$$

(as before, the dots denote terms containing $\alpha^2$, $\alpha^3$, etc.).

Thus, formula (22) has been proved for $k = 4$, as well. In the same
way formula (25) yields

$$(x + \alpha)^5 = x^5 + 5x^4\alpha + \dots \tag{26}$$

Obviously, in the same way we may prove formula (22) for any positive
integer exponent $k$.
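The general step behind this remark can be written out once. The following is a sketch, assuming that formula (22) already holds for some exponent $k$; multiplying both sides by $(x + \alpha)$ and collecting the terms free of $\alpha$ and of first degree in $\alpha$ gives:

```latex
\begin{aligned}
(x + \alpha)^{k+1} &= \left(x^k + k x^{k-1}\alpha + \dots\right)(x + \alpha) \\
                   &= x^{k+1} + x^k\alpha + k x^k\alpha + \dots \\
                   &= x^{k+1} + (k + 1)\,x^k\alpha + \dots
\end{aligned}
```

This is formula (22) with $k$ replaced by $k + 1$, so the passage from each exponent to the next repeats the passage from (24) to (25).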

Let us now return to the extraction of a $k$-th root, where $k$ is any
positive whole number. Suppose that some approximation $x_1$ for the
sought root $\sqrt[k]{a}$ has been found. Denote the error of this
approximation by $\alpha_1$, i.e. suppose that
$x_1 + \alpha_1 = \sqrt[k]{a}$. Then $(x_1 + \alpha_1)^k = a$. But using
formula (22) we may write this equation in the following form:

$$x_1^k + k x_1^{k-1}\alpha_1 + \dots = a$$

where the dots denote terms containing $\alpha_1^2$, $\alpha_1^3$, etc.
If the approximation $x_1$ chosen was close enough to $\sqrt[k]{a}$, the
error $\alpha_1$ of this approximation will be small and we will be able
to neglect terms containing higher powers of the error. Hence, we obtain
the following approximate equality:

$$x_1^k + k x_1^{k-1}\alpha_1 \approx a$$

It follows from this equality that

$$\alpha_1 \approx \frac{a - x_1^k}{k x_1^{k-1}}$$

and for this reason we may take as the next approximation for
$\sqrt[k]{a}$ the number

$$x_2 = x_1 + \frac{a - x_1^k}{k x_1^{k-1}} = \frac{a + (k - 1)x_1^k}{k x_1^{k-1}}$$

In the same way, using the approximation $x_2$, we may find the next
approximation

$$x_3 = \frac{a + (k - 1)x_2^k}{k x_2^{k-1}}$$

In general, if the approximation $x_n$ for $\sqrt[k]{a}$ has been found,
the next approximation will be given by the formula

$$x_{n+1} = \frac{a + (k - 1)x_n^k}{k x_n^{k-1}} \tag{27}$$

As was the case with the extraction of square roots, it may be shown
that the above process converges for any initial approximation $x_1$
(provided this approximation is a positive number). In other words, for
any $x_1$ chosen, the numbers $x_1, x_2, \dots, x_n, \dots$ tend to
$\sqrt[k]{a}$. The approximation process is continued until the numbers
$x_n$ and $x_{n+1}$ coincide within the accuracy required.

Example. Find the value of $\sqrt[3]{970}$ to an accuracy of 0.001. For
$k = 3$ the approximation formula (27) assumes the form

$$x_{n+1} = \frac{a + 2x_n^3}{3x_n^2} \tag{28}$$

In our case $a = 970$. Put $x_1 = 10$. It follows from formula (28) that

$$x_2 = \frac{970 + 2 \times 10^3}{3 \times 10^2} = \frac{2970}{300} = 9.900$$

$$x_3 = \frac{970 + 2 \times 9.9^3}{3 \times 9.9^2} = \frac{2910.60}{294.03} = 9.899$$

We see that the values of $x_2$ and $x_3$ coincide within the accuracy
specified. Therefore we have with an accuracy of 0.001

$$\sqrt[3]{970} = 9.899$$
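Formula (27) and the example above can be sketched in a few lines of Python. The function name `kth_root`, the parameter `max_steps` and the stopping rule "stop when two successive values differ by at most `tol`" are choices made for this illustration; the text itself only asks that successive values coincide within the accuracy required.

```python
def kth_root(a, k, x1, tol=1e-3, max_steps=100):
    """Approximate the k-th root of a > 0 with formula (27), starting from x1 > 0."""
    x = x1
    for _ in range(max_steps):
        x_next = (a + (k - 1) * x**k) / (k * x**(k - 1))
        if abs(x_next - x) <= tol:   # successive values agree within tol
            return x_next
        x = x_next
    raise RuntimeError("no agreement within max_steps")


print(round(kth_root(970, 3, 10.0), 3))
```

Because unrounded values are carried from step to step, the program may take one more step than the hand computation before the stopping rule is met.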

7. Method of Iteration

All the examples considered above are specific cases of a single
general method of solving equations. This method is called the method
of iteration, or the method of successive approximations. The essence of
this method is as follows.

The equation $f(x) = 0$ which is to be solved is rewritten in the form

$$x = \varphi(x) \tag{29}$$

Then an initial approximation $x_1$ is chosen and substituted into the
right-hand side of (29). The value $x_2 = \varphi(x_1)$ so obtained is
taken as the second approximation for the root. In general, if the
approximation $x_n$ has been found, the next approximation $x_{n+1}$ is
obtained from the formula

$$x_{n+1} = \varphi(x_n)$$

Suppose that after several approximations the equality
$x_n \approx x_{n+1}$ is satisfied within the specified accuracy. Since
$x_{n+1} = \varphi(x_n)$ this means that the equation
$x_n \approx \varphi(x_n)$ is also satisfied within that accuracy, i.e.
that $x_n$ is the approximate value of the root of the equation
$x = \varphi(x)$.
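The procedure just described can be sketched as a short general-purpose routine. The name `iterate`, the tolerance argument and the step limit are assumptions of this sketch, and the test equation $x = x/2 + 1$ (root $x = 2$) is a made-up example, not one taken from the text.

```python
def iterate(phi, x1, tol, max_steps=1000):
    """Repeat x_{n+1} = phi(x_n) until two successive values differ by at most tol."""
    x = x1
    for _ in range(max_steps):
        x_next = phi(x)
        if abs(x_next - x) <= tol:
            return x_next
        x = x_next
    raise RuntimeError("successive approximations did not settle")


# Hypothetical example: x = x/2 + 1 has the root x = 2.
print(iterate(lambda x: x / 2 + 1, x1=0.0, tol=1e-6))
```

The step limit guards against the case, discussed later in the text, in which the approximations do not converge at all.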

For instance, in solving the problem of Achilles and the tortoise we
wrote the equation

$$10x - x = 1000$$

in the form

$$x = 100 + \frac{x}{10}$$

and looked for approximations in the form

$$x_{n+1} = 100 + \frac{x_n}{10}$$

In the problem concerning division on an electronic computer we wrote
down the equation

$$ax = b$$

in the form

$$x = (1 - a)x + b$$

and looked for approximations given by the formula

$$x_{n+1} = (1 - a)x_n + b$$

Finally, when extracting the $k$-th root we transformed the equation
$x^k = a$ into

$$x = \frac{a + (k - 1)x^k}{k x^{k-1}}$$

after which we looked for approximations using the formula

$$x_{n+1} = \frac{a + (k - 1)x_n^k}{k x_n^{k-1}}$$

Here is an example of a more complex equation which can be solved by
the method of iterations.

Example. Solve the equation

$$10x - 1 - \cos x = 0 \tag{30}$$

with an accuracy of 0.001.

Rewrite equation (30) in the following form:

$$x = \frac{1 + \cos x}{10} \tag{31}$$

Choose some initial approximation, for instance $x_1 = 0$, and
substitute it into the right-hand side of equation (31). The value
obtained

$$x_2 = \frac{1 + \cos 0}{10} = 0.2$$

will serve as the second approximation for the root sought.
Substituting the value of $x_2$ into the right-hand side of equation
(31) we obtain the third approximation:

$$x_3 = \frac{1 + \cos 0.2}{10} = \frac{1 + 0.98}{10} = 0.198$$

Next we find

$$x_4 = \frac{1 + \cos 0.198}{10} = 0.198$$

We see that the equality $x_3 = x_4$ is satisfied with an accuracy of
0.001. Since $x_4 = \frac{1 + \cos x_3}{10}$ this means that, to an
accuracy of 0.001, the number $x_3 = 0.198$ is the root of the equation
$x = \frac{1 + \cos x}{10}$.
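The computation for equation (31) can be repeated mechanically. The sketch below starts from $x_1 = 0$ as in the text and carries unrounded values from step to step; the variable names are ours.

```python
import math


def phi(x):
    return (1 + math.cos(x)) / 10    # right-hand side of equation (31)


x = 0.0                              # x_1 = 0, as in the text
approximations = [x]
for _ in range(3):                   # compute x_2, x_3, x_4
    x = phi(x)
    approximations.append(x)

print([round(v, 3) for v in approximations])
print("residual of (30):", 10 * x - 1 - math.cos(x))
```

The last line substitutes the final value back into equation (30) as an independent check.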

Several questions arise in connection with the method of iterations:

1. Does the sequence $x_1, \dots, x_n, \dots$ obtained by the iterative
method always converge to some number $\xi$?

2. If the equality $\lim_{n \to \infty} x_n = \xi$ is true, is the
number $\xi$ a solution of the equation $x = \varphi(x)$?

3. How rapidly do the numbers $x_1, \dots, x_n, \dots$ approach the root
of the equation $x = \varphi(x)$?

The second question is the easiest to answer. Suppose the numbers
$x_1, \dots, x_n, \dots$ approach the number $\xi$. Consider the
equality $x_{n+1} = \varphi(x_n)$ which expresses the next approximation
in terms of the preceding one. As $n$ increases, its left-hand side
approaches $\xi$, and the right-hand side approaches $\varphi(\xi)$*.
Hence we obtain in the limit $\xi = \varphi(\xi)$, i.e. $\xi$ is a root
of the equation $x = \varphi(x)$.

The answer to the first question is in the negative. Indeed, consider,
for instance, the equation

$$x = 10^x - 2$$

If we put $x_1 = 1$ we obtain

$$x_2 = 8, \quad x_3 = 10^8 - 2, \quad \dots$$

As $n$ increases, the numbers $x_n$ also increase, but do not tend to
any limit. On the other hand, if we rewrite the equation in the form
$x = \log_{10}(x + 2)$, the approximation process will converge and we
obtain after three approximations $x = 0.38$.
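The contrast between the two forms of the same equation is easy to observe numerically. The sketch below assumes that the logarithm is decimal and that both processes start from $x_1 = 1$ (the text does not state the starting value for the second form). Only two steps of the first form are taken, since the values grow explosively.

```python
import math

# Form 1: x = 10**x - 2 (integers keep the huge third value exact).
x = 1
diverging = [x]
for _ in range(2):
    x = 10**x - 2
    diverging.append(x)

# Form 2: x = log10(x + 2), three steps from the same start.
y = 1.0
converging = [y]
for _ in range(3):
    y = math.log10(y + 2)
    converging.append(y)

print(diverging)
print([round(v, 3) for v in converging])
```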

Therefore instead of the first question we should ask the following
one:

What form of the function $\varphi(x)$ guarantees the convergence of the
sequence of numbers $x_1, x_2, \dots, x_n, \dots$?

Before dealing with this question we shall discuss the geometrical
interpretation of the method of iterations.


8. Geometrical Meaning of Method of Iterations

Clearly, finding the root $\xi$ of the equation $x = \varphi(x)$ is just
the same as finding the abscissa of the point $M$ of intersection of the
curve $y = \varphi(x)$ with the straight line $y = x$. Suppose we have
some initial value $x_1$ (Fig. 2). In this case the point
$M_1(x_1, \varphi(x_1))$ lies on the curve $y = \varphi(x)$. Draw a
horizontal line through this point. It will intersect the straight line
$y = x$ in the point $N_1(\varphi(x_1), \varphi(x_1))$. Denote
$\varphi(x_1)$ by $x_2$. Then the coordinates of the point $N_1$ will be
of the form $N_1(x_2, x_2)$. Next draw a vertical line through the point
$N_1$. It will intersect the curve $y = \varphi(x)$ in the point
$M_2(x_2, \varphi(x_2))$. Repeating the process, we obtain the point
$N_2(x_3, x_3)$ on the straight line $y = x$, where
$x_3 = \varphi(x_2)$, then the point $M_3(x_3, \varphi(x_3))$ on the
curve $y = \varphi(x)$, etc. If the approximation process converges, the
points $M_1, M_2, \dots, M_n, \dots$ will approach the point of
intersection sought.

*We suppose $\varphi(x)$ to be a continuous function.

Hence, the geometrical meaning of the method of successive
approximations is that we move towards the required point of
intersection of the curve and the straight line along a broken line
whose vertices lie in turn on the curve and on the straight line and
whose sections are in turn horizontal and vertical (Fig. 2a).
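The broken line can be generated point by point. The function `cobweb_vertices` below is a hypothetical helper written for this illustration; it returns the vertices in the order $M_1, N_1, M_2, N_2, \dots$ and leaves the actual drawing to whatever plotting tool the reader prefers.

```python
import math


def cobweb_vertices(phi, x1, steps):
    """Return the vertices M_1, N_1, M_2, N_2, ... of the broken line."""
    vertices = []
    x = x1
    for _ in range(steps):
        y = phi(x)
        vertices.append((x, y))   # M_n lies on the curve y = phi(x)
        vertices.append((y, y))   # N_n lies on the line y = x
        x = y                     # x_{n+1} = phi(x_n)
    return vertices


for point in cobweb_vertices(lambda x: (1 + math.cos(x)) / 10, 0.0, 3):
    print(point)
```

Consecutive vertices share either an ordinate (horizontal section from $M_n$ to $N_n$) or an abscissa (vertical section from $N_n$ to $M_{n+1}$).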



Fig. 2 (a), (b)


If the curve and the straight line are situated as shown in Fig. 2a,
then this broken line looks like a ladder. If, on the other hand, the
curve and the straight line are as shown in Fig. 2b, then the broken
line looks like a spiral.

Fig. 3 (a), (b)

The process of successive approximations described above may diverge
without leading to any result (as was so in the case of the problem of
Achilles and the antelope). Graphically it means that the steps of the
ladder (or the spiral) become larger and larger and because of this the
points $M_1, \dots, M_n, \dots$ instead of approaching the point $M$
move away from it (Fig. 3).

The difference between Figs. 2 and 3 lies in the following. Draw a
straight line inclined at 135° to the $x$-axis through the point $M$ of
intersection of the straight line $y = x$ and the curve
$y = \varphi(x)$. This straight line, together with the line $y = x$,
will divide the plane into four quadrants. If the curve in the
neighbourhood of the point $M$ lies in the left and the right quadrants
of the plane and if the initial approximation is taken in this
neighbourhood, then the iteration process converges. If, on the other
hand, the curve lies in the upper and the lower quadrants of the plane,
the process will be divergent.
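The quadrant rule can be restated as an inequality. This restatement does not appear in the text; it is a sketch that assumes the point $M$ has coordinates $(\xi, \xi)$, so that the two dividing lines are $y = x$ and the line of slope $-1$ through $M$:

```latex
\begin{aligned}
&\text{lines through } M(\xi, \xi): \quad y - \xi = x - \xi, \qquad y - \xi = -(x - \xi) \\
&\text{left and right quadrants:} \quad |y - \xi| < |x - \xi| \\
&\text{upper and lower quadrants:} \quad |y - \xi| > |x - \xi| \\
&\text{on the curve, with } x_{n+1} = \varphi(x_n): \quad
  |x_{n+1} - \xi| < |x_n - \xi| \quad \text{or} \quad |x_{n+1} - \xi| > |x_n - \xi|
\end{aligned}
```

In the first case every step brings the approximation closer to $\xi$; in the second case every step takes it farther away.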

However, to use this rule one has first to sketch the graph of the
function $y = \varphi(x)$, but this is not always expedient. So another
convergence test has to be devised for the iteration process which
would enable the convergence (or divergence) to be established
analytically without any geometric constructions. This test will be
discussed in Sec. 10. But first we should become acquainted with the
concept of the contraction mapping.


9. Contraction Mappings (Contractions)

Consider the function $y = \varphi(x)$ defined on the interval $[a, b]$.
For every point $x_0$ of this interval there is a corresponding point
$y_0$ on the $y$-axis, namely the point $y_0 = \varphi(x_0)$. To plot
this point one has to draw a vertical line through the point $x_0$ of
the $x$-axis until it intersects the graph of the function
$y = \varphi(x)$ and then to draw a horizontal line through the point of
intersection until it intersects the $y$-axis (Fig. 4). Thus, the
function $y = \varphi(x)$ gives a mapping or map of the interval
$[a, b]$ to the $y$-axis. The set of all points on the $y$-axis
corresponding to points of the interval $[a, b]$ is called the image of
the interval. For instance, the image of the interval $[2, 5]$ under the
mapping $y = x^2$ is the interval $[4, 25]$, and the image of the
interval $[-1, 6]$ under the same mapping is the interval $[0, 36]$
(draw the graph of the function $y = x^2$). It may be proved that if the
function $y = \varphi(x)$ is continuous on the interval $[a, b]$, then
the image of this interval will also be an interval on the $y$-axis. If
the function $y = \varphi(x)$ is also a monotonic increasing function,
the image of the interval $[a, b]$ is the interval
$[\varphi(a), \varphi(b)]$, while if it is a monotonic decreasing
function, the image is the interval $[\varphi(b), \varphi(a)]$ (Fig. 5).




Instead of considering the mapping of the interval $[a, b]$ to the
$y$-axis one may consider its mapping to the $x$-axis. To do this, after
mapping the interval to the $y$-axis, rotate the $y$-axis clockwise
through 90°. As a result, the points of the interval $[a, b]$ will first
be mapped to points on the $y$-axis and then to points on the $x$-axis.
In this way the function $\varphi(x)$ gives a mapping of the interval
$[a, b]$ to the $x$-axis. We shall denote this mapping as follows:
$x \to \varphi(x)$. If the function $\varphi(x)$ is continuous, we
obtain as a result an interval on the $x$-axis.

It may happen that the image $[a_1, b_1]$ of the interval $[a, b]$ turns
out to be a part of $[a, b]$. For instance, under the mapping
$y = \sqrt{x} + 1$ the interval $[0, 4]$ maps to a part of this
interval, the interval $[1, 3]$. In such cases we shall speak of
$\varphi(x)$ mapping the interval $[a, b]$ to a subinterval. If
$\varphi(x)$ maps the interval $[a, b]$ to a subinterval $[a_1, b_1]$,
then any subinterval of $[a, b]$ will map to a subinterval of
$[a_1, b_1]$. In particular, the interval $[a_1, b_1]$ will itself be
mapped by $\varphi(x)$ to its subinterval $[a_2, b_2]$. In the same way
the interval $[a_2, b_2]$ maps to a subinterval $[a_3, b_3]$ under the
same mapping, and so on. As a result, we obtain a system of intervals:

$$[a, b], \ [a_1, b_1], \ \dots, \ [a_n, b_n], \ \dots$$

each of which is a subinterval of the preceding interval and such that
$[a_{n+1}, b_{n+1}]$ is the image of $[a_n, b_n]$ under the mapping
$\varphi(x)$.

For instance, the mapping $x \to 1 - \frac{1}{x + 2}$ takes the interval
$[0, 4]$ to its subinterval $[1/2, 5/6]$. Applying this mapping to the
interval $[1/2, 5/6]$ we obtain the interval $[3/5, 11/17]$, etc. Every
successive interval is included in the preceding one.
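The nested intervals of this example can be computed exactly with rational arithmetic. The sketch below relies on the fact that $1 - \frac{1}{x+2}$ is increasing for $x > -2$, so the image of an interval is given by the images of its end points; the helper name `image` is ours.

```python
from fractions import Fraction


def phi(x):
    return 1 - 1 / (x + 2)          # increasing for x > -2


def image(interval):
    """Image of [lo, hi] under an increasing map: [phi(lo), phi(hi)]."""
    lo, hi = interval
    return (phi(lo), phi(hi))


interval = (Fraction(0), Fraction(4))
for _ in range(3):
    interval = image(interval)
    print(interval[0], interval[1], "length", interval[1] - interval[0])
```

The printed lengths shrink rapidly, which is the behaviour the following paragraphs describe as contracting to a point.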

Two cases are possible: either there is an interval $[c, d]$ common to
all the intervals $[a_n, b_n]$, or the intervals have only one common
point. In the latter case the system of intervals $[a_n, b_n]$ is said
to contract to a point.

Below we shall formulate conditions for the system of intervals
$[a, b], [a_1, b_1], \dots, [a_n, b_n], \dots$ to contract to a point.
For this let us introduce the important concept of a contraction. The
mapping $\varphi(x)$ which takes the interval $[a, b]$ to its
subinterval $[a_1, b_1]$ is called a contraction if it decreases the
distance between any two points of this interval at least $M$ times,
where $M > 1$. Since the distance between $x_2$ and $x_1$ is
$|x_2 - x_1|$, the condition may be formulated as follows.

A mapping is a contraction on the interval $[a, b]$ if there is a number
$q$, where $0 < q < 1$, such that for any two points $x_1$, $x_2$
belonging to the interval $[a, b]$ the inequality

$$|\varphi(x_2) - \varphi(x_1)| \le q|x_2 - x_1| \tag{32}$$

is satisfied (here $q = 1/M$).

The length of an arbitrary subinterval $[c, d]$ of the interval $[a, b]$
is decreased by a contraction mapping $\varphi(x)$ at least $M = 1/q$
times. Indeed, let $[c_1, d_1]$ be the image of the interval $[c, d]$.
Then $c_1$ and $d_1$ are the images of some points $x_1$ and $x_2$ of
the interval $[c, d]$:

$$c_1 = \varphi(x_1), \quad d_1 = \varphi(x_2)$$

But then

$$|d_1 - c_1| = |\varphi(x_2) - \varphi(x_1)| \le q|x_2 - x_1|$$

Since the points $x_1$ and $x_2$ lie in the interval $[c, d]$, the
distance between them, $|x_2 - x_1|$, does not exceed the length
$|d - c|$ of the interval $[c, d]$. Therefore

$$|d_1 - c_1| \le q|d - c|$$

Our statement has been proved.

Now we can formulate a condition for the system of intervals
$[a_1, b_1], \dots, [a_n, b_n], \dots$, obtained from the interval
$[a, b]$ by successive use of the mapping $\varphi(x)$, to contract to a
point.

If the mapping $\varphi(x)$ which takes the interval $[a, b]$ to its
subinterval $[a_1, b_1]$ is a contraction, then the system of intervals
$[a_1, b_1], \dots, [a_n, b_n], \dots$ will collapse to a point $\xi$
belonging to the interval $[a, b]$.

Indeed, since the mapping $\varphi(x)$ is a contraction, for any $n$

$$|b_n - a_n| \le q|b_{n-1} - a_{n-1}|$$

In the same way

$$|b_{n-1} - a_{n-1}| \le q|b_{n-2} - a_{n-2}|$$

But then

$$|b_n - a_n| \le q^2|b_{n-2} - a_{n-2}|$$

Repeating this reasoning we obtain

$$|b_n - a_n| \le q^n|b - a|$$

Since $0 < q < 1$, the sequence of numbers $q, q^2, \dots, q^n, \dots$
tends to zero and so the lengths $|b_n - a_n|$ of the intervals
$[a_n, b_n]$ tend to zero as $n$ tends to infinity. Hence there can be
no interval $[c, d]$ which is a subinterval of all the intervals
$[a_n, b_n]$. Therefore the system of intervals

$$[a, b], \ [a_1, b_1], \ \dots, \ [a_n, b_n], \ \dots$$

contracts to a point.

Finally, let us consider mappings $\varphi(x)$ for which inequality (32),

$$|\varphi(x_2) - \varphi(x_1)| \le q|x_2 - x_1|$$

is satisfied for any pair of numbers $x_2$ and $x_1$. Such mappings are
contractions on the whole number axis. Let us demonstrate that in this
case there is an interval which contracts under the mapping
$\varphi(x)$. Since condition (32) is satisfied for any two points
$x_1$ and $x_2$, it suffices to show that there is an interval which
$\varphi(x)$ maps into itself. Take an arbitrary number $a$ and put
$b = \varphi(a)$. Choose the number $q_1 < 1$ so that $q < q_1$.

Let us put

$$R = \frac{|b - a|}{1 - q_1}$$

We shall show that the interval $[a - R, a + R]$ is taken by the mapping
$\varphi(x)$ into a subinterval. Indeed, let $x$ be a point of this
interval. Then $|x - a| \le R$. By virtue of inequality (32) we conclude
that

$$|\varphi(x) - b| = |\varphi(x) - \varphi(a)| \le q|x - a| \le qR$$

But then

$$|\varphi(x) - a| = |\varphi(x) - b + b - a| \le |\varphi(x) - b| + |b - a| \le qR + |b - a| = qR + (1 - q_1)R = (1 + q - q_1)R < R$$

This demonstrates that any point of the interval $[a - R, a + R]$ is
taken by the mapping $\varphi(x)$ to a point of the same interval and
that, consequently, the mapping $\varphi(x)$ contracts the interval
$[a - R, a + R]$.


10. Contraction Mappings and Method of Iteration

Let us now return to the method of iteration. This method is used in
solving equations of the type $x = \varphi(x)$. If $\xi$ is a root of
this equation, then $\xi = \varphi(\xi)$, and the mapping
$x \to \varphi(x)$ leaves the point $\xi$ fixed where it is. Hence, the
problem of solving the equation $x = \varphi(x)$ is equivalent to the
problem of finding the fixed points of the mapping $\varphi(x)$.

If the mapping $\varphi(x)$ is a contraction on the interval $[a, b]$,
there is always a fixed point in this interval. To convince ourselves
of this let us take a set of intervals

$$[a_1, b_1], \ [a_2, b_2], \ \dots, \ [a_n, b_n], \ \dots$$

obtained from $[a, b]$ by successive use of the mapping $\varphi(x)$.
Since $\varphi(x)$ is a contraction mapping on the interval $[a, b]$,
there is a unique point $\xi$ common to all the intervals $[a_n, b_n]$.
This is a fixed point of the mapping $\varphi(x)$.

Indeed, the mapping $\varphi(x)$ takes every interval $[a_n, b_n]$ to a
subinterval $[a_{n+1}, b_{n+1}]$. Therefore, the image $\varphi(x)$ of
any point $x$ of the interval $[a_n, b_n]$ lies in the subinterval
$[a_{n+1}, b_{n+1}]$ and so certainly inside $[a_n, b_n]$. Since the
point $\xi$ belongs to all the intervals $[a_n, b_n]$, its image
$\varphi(\xi)$ must also belong to all these intervals. But the only
point which belongs to all the intervals $[a_n, b_n]$ is the point
$\xi$. Therefore, $\varphi(\xi) = \xi$, i.e. $\xi$ is a fixed point of
the mapping $\varphi(x)$.

Thus, for contraction mappings on the interval $[a, b]$ there is always
a fixed point lying in the interval. This point is unique. Indeed,
should there be another fixed point $\eta$, $\eta = \varphi(\eta)$, the
inequality

$$|\eta - \xi| = |\varphi(\eta) - \varphi(\xi)| \le q|\eta - \xi|$$

would hold. Since $0 < q < 1$, this inequality can be satisfied only if
$|\eta - \xi| = 0$, i.e. if $\eta = \xi$.

Now we can formulate a sufficient condition for the convergence of the
iteration process.

Suppose the function $\varphi(x)$ yields a contraction mapping on the
interval $[a, b]$. Then for any point $x_0$ belonging to this interval
the sequence of numbers $x_0, x_1, x_2, \dots, x_n, \dots$, where
$x_{n+1} = \varphi(x_n)$, converges to a root $\xi$ of the equation
$x = \varphi(x)$ which lies in this interval.

Indeed, let $[a_n, b_n]$, $n = 1, 2, \dots$, be a sequence of intervals
obtained from $[a, b]$ by sequential applications of the mapping
$\varphi(x)$. Since the point $x_0$ lies in the interval $[a, b]$, its
image $x_1 = \varphi(x_0)$ lies in the interval $[a_1, b_1]$, the image
$x_2 = \varphi(x_1)$ of the point $x_1$ lies in the interval
$[a_2, b_2]$, and so on. Thus, for any $n$ the point $x_n$ lies in the
interval $[a_n, b_n]$. Since the lengths of the intervals $[a_n, b_n]$
approach zero as $n$ increases, the sequence of points
$x_1, \dots, x_n, \dots$ approaches the common point $\xi$ of these
intervals.

The above reasoning shows that any point $x_0$ of the interval $[a, b]$
may be taken as the initial point.

Let us now find the rate at which the points
$x_0, x_1, \dots, x_n, \dots$ approach the point $\xi$. Since
$\xi = \varphi(\xi)$, we have for any point $c$ of the interval $[a, b]$

$$|\varphi(c) - \xi| = |\varphi(c) - \varphi(\xi)| \le q|c - \xi| \tag{33}$$

Apply inequality (33) to the points $x_0, \dots, x_n, \dots$. Since
$x_n = \varphi(x_{n-1})$, it follows that

$$|x_n - \xi| = |\varphi(x_{n-1}) - \xi| \le q|x_{n-1} - \xi|$$

But then for any $n$ we have

$$|x_n - \xi| \le q|x_{n-1} - \xi| \le q^2|x_{n-2} - \xi| \le \dots \le q^n|x_0 - \xi|$$

Hence, the error $|x_n - \xi|$ decreases with increasing $n$ at least as
fast as a geometric progression having ratio $q$.
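The geometric bound can be watched at work on the mapping $x \to 1 - \frac{1}{x + 2}$ of the interval $[0, 4]$ used in Sec. 9. Two things in the sketch below are our own computations rather than statements of the text: the constant $q = 1/4$, which follows from $|\varphi(x_2) - \varphi(x_1)| = \frac{|x_2 - x_1|}{(x_1 + 2)(x_2 + 2)}$ for non-negative $x_1$, $x_2$, and the fixed point $\xi = \frac{\sqrt{5} - 1}{2}$, the positive root of $x^2 + x - 1 = 0$.

```python
import math


def phi(x):
    return 1 - 1 / (x + 2)


q = 0.25                        # assumed contraction constant on [0, 4]
xi = (math.sqrt(5) - 1) / 2     # root of x = phi(x), i.e. of x*x + x - 1 = 0
x0 = 4.0

x = x0
for n in range(1, 11):
    x = phi(x)
    error = abs(x - xi)
    bound = q**n * abs(x0 - xi)
    print(n, f"{error:.3e}", f"{bound:.3e}")
```

The actual errors fall faster than the bound, which is consistent with the words "at least as fast" in the statement above.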

Let us give some examples of how the condition proved above may be
applied.

Example 1. Can the iteration method be applied to the solution of the
equation

$$x = \frac{1}{4 + x^2} \tag{34}$$

In this case

$$\varphi(x) = \frac{1}{4 + x^2}$$

For arbitrary $x_1$ and $x_2$, we have

$$|\varphi(x_2) - \varphi(x_1)| = \left|\frac{1}{4 + x_2^2} - \frac{1}{4 + x_1^2}\right| = \frac{|x_1^2 - x_2^2|}{(4 + x_1^2)(4 + x_2^2)} = \frac{|x_1 + x_2|}{(4 + x_1^2)(4 + x_2^2)}\,|x_2 - x_1|$$

Using the inequality between the geometric mean and the arithmetic mean
we obtain

$$|x| = \frac{1}{2}\sqrt{4x^2} \le \frac{4 + x^2}{4}$$

Therefore

$$|x_1 + x_2| \le |x_1| + |x_2| \le \frac{(4 + x_1^2) + (4 + x_2^2)}{4} = 2 + \frac{x_1^2 + x_2^2}{4} \le 2 + \frac{x_1^2 + x_2^2}{2} + \frac{x_1^2 x_2^2}{8} = \frac{1}{8}(4 + x_1^2)(4 + x_2^2)$$

We have proved that for any $x_1$ and $x_2$ the inequality

$$\frac{|x_1 + x_2|}{(4 + x_1^2)(4 + x_2^2)} \le \frac{1}{8}$$

holds and that therefore

$$|\varphi(x_2) - \varphi(x_1)| \le \frac{1}{8}|x_2 - x_1|$$

This means that the mapping $\varphi(x)$ is a contraction on the entire
axis.

We know already that in this case there is an interval which is mapped
by this contraction into itself. To find it put $a = 0$. The mapping
$\varphi(x)$ takes the point $a = 0$ to the point $b = 1/4$.
Furthermore, in our case $q = 1/8$. Put $q_1 = 1/4$ and denote the
number

$$\frac{|b - a|}{1 - q_1} = \frac{1/4}{3/4} = \frac{1}{3}$$

by $R$. The interval $\left[-\frac{1}{3}, \frac{1}{3}\right]$ is mapped
by $\varphi(x)$ into itself. Consequently, there is one fixed point in
this interval, which is a root of equation (34). To find this point take
an arbitrary point of the interval
$\left[-\frac{1}{3}, \frac{1}{3}\right]$, for instance the point
$x_0 = 0$. Using the method of iterations we obtain

$$x_1 = \frac{1}{4} = 0.25$$

$$x_2 = \frac{1}{4 + 0.25^2} = \frac{1}{4.0625} = 0.2461$$

$$x_3 = \frac{1}{4 + 0.2461^2} = \frac{1}{4.0605} = 0.2463$$

$$x_4 = \frac{1}{4 + 0.2463^2} = \frac{1}{4.0606} = 0.2463$$

So within an accuracy of 0.0001 we have $x_3 = x_4$. It follows that,
with an accuracy of 0.0001, the root of equation (34) lying in the
interval $\left[-\frac{1}{3}, \frac{1}{3}\right]$ is 0.2463. Since the
mapping $\varphi(x)$ is a contraction on the whole axis, equation (34)
has no other roots.
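Both halves of Example 1 can be spot-checked by machine. The sketch below first tests the contraction inequality with $q = 1/8$ on randomly chosen pairs of points (a numerical sanity check over an arbitrarily chosen range, not a proof) and then repeats the four iteration steps from $x_0 = 0$.

```python
import random


def phi(x):
    return 1 / (4 + x * x)


# Spot-check the contraction constant 1/8 on random pairs of points.
random.seed(0)
for _ in range(10_000):
    u, v = random.uniform(-50, 50), random.uniform(-50, 50)
    assert abs(phi(u) - phi(v)) <= abs(u - v) / 8 + 1e-15

x = 0.0                      # x_0 = 0, as in the text
for n in range(1, 5):
    x = phi(x)
    print(n, round(x, 4))
```

The program carries unrounded values and rounds only for display, so an intermediate value may differ from the text's in the last digit.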

Example 2. Can the method of successive approximations be used to solve
the equation

$$x = 1 + \sqrt[3]{x}$$

in the interval $[-1, 8]$?

Here $\varphi(x) = 1 + \sqrt[3]{x}$. Since $\varphi(-1) = 0$ and
$\varphi(8) = 3$, $\varphi(x)$ maps the interval $[-1, 8]$ into itself.
However, on this interval it is not a contraction since, for instance,
if $x_1 = -0.008$, $x_2 = 0.008$

$$|\varphi(x_2) - \varphi(x_1)| = \left|\sqrt[3]{0.008} - \sqrt[3]{-0.008}\right| = 0.4 > |x_2 - x_1|$$
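The counterexample is quickly confirmed numerically. Since raising a negative number to the power 1/3 does not give the real cube root in Python, the sketch uses a small helper, `real_cbrt`, which is our own name for it.

```python
import math


def real_cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1 / 3), x)


def phi(x):
    return 1 + real_cbrt(x)


x1, x2 = -0.008, 0.008
print(abs(phi(x2) - phi(x1)), abs(x2 - x1))
```

The first printed number is about 25 times the second, so no constant $q < 1$ can satisfy inequality (32) on an interval containing both points.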


In proving that the mapping of Example 1 was a contraction we used the
inequality $\sqrt{ab} \le \frac{a + b}{2}$. Now we shall introduce some
inequalities which often have to be used to prove that some mapping is
a contraction. Prove that for $0 < x < \pi/2$ the inequality

$$\sin x < x < \tan x \tag{35}$$

holds. To do this note that the area $S_{\text{sect}\,OAB}$ (Fig. 6) of
the sector $OAB$ with central angle $x$ lies between the areas of the
triangles $OAB$ and $OAT$:

$$S_{\triangle OAB} < S_{\text{sect}\,OAB} < S_{\triangle OAT}$$

But

$$S_{\triangle OAB} = \frac{R^2 \sin x}{2}, \quad S_{\triangle OAT} = \frac{R^2 \tan x}{2}$$

($R$ is the radius of the circle). The area of the sector $OAB$ is
$\frac{R^2 x}{2}$ (the angle is measured in radians). Therefore

$$\frac{R^2 \sin x}{2} < \frac{R^2 x}{2} < \frac{R^2 \tan x}{2}$$

Cancelling $R^2/2$ we obtain inequality (35). From inequality (35) it
follows that for $0 < x < 1$ we have

$$x < \arcsin x$$

and for $x > 0$

$$x > \arctan x$$

Note also the inequalities

$$e^x > 1 + x, \quad x > 0,$$

$$\ln(1 + x) < x, \quad 0 < x < 1$$

which are somewhat more difficult to prove.

Example 3. Find out whether the equation

$$x = 1 + \frac{1}{2}\arctan x \tag{36}$$

can be solved by the method of iterations.

Since for all values of $x$ we have $1 + \frac{1}{2}\arctan x > 0$, the
equation can have only positive roots. We have

$$\varphi(x) = 1 + \frac{1}{2}\arctan x \tag{37}$$

Therefore

$$|\varphi(x_2) - \varphi(x_1)| = \left|\left(1 + \frac{1}{2}\arctan x_2\right) - \left(1 + \frac{1}{2}\arctan x_1\right)\right| = \frac{1}{2}|\arctan x_2 - \arctan x_1|$$

But for $x_1 \ge 0$, $x_2 \ge 0$

$$\arctan x_2 - \arctan x_1 = \arctan\frac{x_2 - x_1}{1 + x_1 x_2}$$

and therefore

$$|\varphi(x_2) - \varphi(x_1)| = \frac{1}{2}\left|\arctan\frac{x_2 - x_1}{1 + x_1 x_2}\right| \le \frac{1}{2}\,\frac{|x_2 - x_1|}{1 + x_1 x_2} \le \frac{1}{2}|x_2 - x_1|$$

It follows that the mapping is a contraction on the semiaxis
$[0, \infty)$. It maps the interval $[0, \sqrt{3}]$ on its subinterval
$[1, 1 + \pi/6]$. Therefore there is a unique root of equation (36)
which lies in the interval $[1, 1 + \pi/6]$. To find this root put
$x_1 = 1$. Then

$$x_2 = 1 + \tfrac{1}{2}\arctan 1 = 1 + \tfrac{\pi}{8} \approx 1.39$$

$$x_3 = 1 + \tfrac{1}{2}\arctan 1.39 = 1.474$$

$$x_4 = 1 + \tfrac{1}{2}\arctan 1.474 = 1.487$$

$$x_5 = 1 + \tfrac{1}{2}\arctan 1.487 = 1.489$$

$$x_6 = 1 + \tfrac{1}{2}\arctan 1.489 = 1.490$$

$$x_7 = 1 + \tfrac{1}{2}\arctan 1.490 = 1.490$$

We see that the equality $x_6 = x_7 = 1.490$ is satisfied with an
accuracy of 0.001. This means that with this accuracy the root of our
equation is 1.490. Since the mapping $\varphi(x)$ is a contraction on
the entire semiaxis $0 \le x < \infty$, equation (36) has no other
roots.
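The seven approximations of Example 3 can be regenerated with a few lines of code. This is a sketch only; it starts from $x_1 = 1$ as in the text and keeps full floating-point precision between steps, whereas the hand computation rounds each intermediate value.

```python
import math


def phi(x):
    return 1 + 0.5 * math.atan(x)    # right-hand side of equation (36)


x = 1.0                               # x_1 = 1, as in the text
values = [x]
for _ in range(6):                    # x_2 ... x_7
    x = phi(x)
    values.append(x)

for n, v in enumerate(values, start=1):
    print(f"x_{n} = {v:.3f}")
```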

It often happens that an equation $x = \varphi(x)$ which cannot be
solved by means of the method of iterations can be transformed to an
equation which makes the use of this method possible. Let us take, for
instance, the equation

$$x = x^3 - 2 \tag{38}$$

Since here we have

$$\varphi(1) = -1 < 1, \quad \varphi(2) = 6 > 2$$

this equation has a root lying in the interval $[1, 2]$. But the mapping
$x^3 - 2$ is not a contraction on this interval since it does not map it
into a subinterval. Rewrite equation (38) in the form

$$x = \sqrt[3]{x + 2}$$

Putting $\psi(x) = \sqrt[3]{x + 2}$ we have

$$|\psi(x_2) - \psi(x_1)| = \left|\sqrt[3]{x_2 + 2} - \sqrt[3]{x_1 + 2}\right| = \frac{|x_2 - x_1|}{\sqrt[3]{(x_2 + 2)^2} + \sqrt[3]{(x_1 + 2)(x_2 + 2)} + \sqrt[3]{(x_1 + 2)^2}}$$

In the interval $[1, 2]$ we have $x_1 \ge 1$, $x_2 \ge 1$, so each of
the three terms in the denominator is at least $\sqrt[3]{9}$. Therefore

$$|\psi(x_2) - \psi(x_1)| \le \frac{|x_2 - x_1|}{3\sqrt[3]{9}}$$

We have proved that the mapping $\psi(x)$ is a contraction on the
interval $[1, 2]$. Put $x_1 = 1$ and apply the method of iterations.

$$x_2 = \sqrt[3]{3} = 1.442$$

$$x_3 = \sqrt[3]{3.442} = 1.510$$

$$x_4 = \sqrt[3]{3.510} = 1.520$$

$$x_5 = \sqrt[3]{3.520} = 1.521$$

$$x_6 = \sqrt[3]{3.521} = 1.521$$

Hence, with an accuracy of 0.001, 1.521 is the root of equation (38)
lying inside the interval $[1, 2]$. The equation has no other roots.

We see that an appropriate transformation of the initial equation has
reduced it to the form to which the method of iterations is applicable.
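The transformed equation lends itself to the same kind of check. In the sketch below the five steps from $x_1 = 1$ are repeated and the final value is substituted back into the original equation (38); the variable names are ours.

```python
def psi(x):
    return (x + 2) ** (1 / 3)    # right-hand side of x = cbrt(x + 2); x + 2 > 0 here


x = 1.0                           # x_1 = 1, as in the text
for n in range(2, 7):             # x_2 ... x_6
    x = psi(x)
    print(f"x_{n} = {x:.3f}")

print("residual of (38):", x - (x**3 - 2))
```

Running the original form $x \to x^3 - 2$ from the same starting point instead sends the values away from the root, which is why the rewriting was needed.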

The convergence test for the method of iterations described here is not
very convenient to use since it requires proof of rather complex
inequalities. Below (Sec. 21) we shall consider one corollary of this
test which makes the proof of convergence of the iteration process much
easier.


11. Method of Chords

The method of iterations is one of the most general methods for the
approximate solution of equations. Many other methods of approximate
solution of equations are just particular cases of the method of
iterations. Now we shall describe one of these methods, called the
method of chords (rule of false position).

Suppose we have to solve the equation $f(x) = 0$. This problem is
equivalent to that of finding the points in which the graph of the
function $y = f(x)$ intersects the $x$-axis. Suppose the function $f(x)$
is continuous and its values at points $a$ and $b$ have different signs.
Then for at least one point of the interval $[a, b]$ the function
vanishes. In other words, the graph $y = f(x)$ intersects the $x$-axis
in at least one point $\xi$ of the interval $[a, b]$. In general, there
may be several such points (Fig. 7). However, if the function
$y = f(x)$ is monotonic on the interval $[a, b]$ and its values at the
ends of the interval have opposite signs, then the graph of this
function intersects the $x$-axis in only one point. To find this point
approximately, substitute the chord $MN$ for the arc of the curve
$y = f(x)$ on the interval $[a, b]$, and find the point of intersection
$T$ of this chord with the $x$-axis (Fig. 8).

To do this consider the similar triangles $MM_1T$ and $NN_1T$. From the
similarity of these triangles it follows that

$$\frac{M_1T}{MM_1} = \frac{TN_1}{N_1N}$$

But from Fig. 8 it may be seen that $M_1T = a_1 - a$, $TN_1 = b - a_1$,
$MM_1 = -f(a)$ and $N_1N = f(b)$, where $a_1$ denotes the abscissa of
the point of intersection of the chord $MN$ with the $x$-axis. Therefore

$$\frac{a_1 - a}{-f(a)} = \frac{b - a_1}{f(b)}$$

Solving this equation we obtain

$$a_1 = \frac{a f(b) - b f(a)}{f(b) - f(a)}$$

Fig. 8

This may also be written in the form:

$$a_1 = b - f(b)\,\frac{b - a}{f(b) - f(a)} \tag{39}$$

or

$$a_1 = a - f(a)\,\frac{b - a}{f(b) - f(a)} \tag{40}$$

[check by reducing the right-hand sides of formulas (39) and (40) to a
common denominator].

The number $a_1$ is the approximate value of the root of the equation
$f(x) = 0$ lying between the points $a$ and $b$.

Since the signs of the numbers $f(a)$ and $f(b)$ are opposite, there are
two possibilities: either the sign of $f(a)$, or the sign of $f(b)$ is
opposite to that of $f(a_1)$. If the signs of the function $f(x)$ at
points $a$ and $a_1$ are opposite, formula (39) is applied to the
interval $[a, a_1]$ and the following approximation is obtained for the
root sought:

$$a_2 = a_1 - f(a_1)\,\frac{a_1 - a}{f(a_1) - f(a)} \tag{41}$$

If, on the other hand, the function $f(x)$ assumes values having
opposite signs at points $a_1$ and $b$, then formula (40) is applied to
the interval $[a_1, b]$ by taking

$$a_2 = a_1 - f(a_1)\,\frac{b - a_1}{f(b) - f(a_1)} \tag{42}$$

After finding the value of $a_2$, formula (39) is applied to the
interval $[a, a_2]$ (or formula (40) is applied to the interval
$[a_2, b]$ as the case may be) and the next approximation $a_3$ is
found. Generally, if the approximation $a_n$ has already been found, the
next approximation is found from the formula

$$a_{n+1} = a_n - f(a_n)\,\frac{a_n - a}{f(a_n) - f(a)} \tag{43}$$

or the formula

$$a_{n+1} = a_n - f(a_n)\,\frac{b - a_n}{f(b) - f(a_n)} \tag{44}$$

We have obtained two formulae (43) and (44). Let us see now when 
each one should be used. Suppose the curve is concave up. In this case 
the points of the curve should be joined to whichever of its ends, M or
N, is the one where the function is positive. If, on the other hand, the curve is 
concave down, the points should be joined to the end where the func¬ 
tion is negative. The different situations which might arise are illus¬ 
trated in Fig. 9. These drawings make the following statement geome¬ 
trically obvious: 

Suppose the function f(x) is continuous and monotonic on the inter¬ 
val [a, b], the direction of its concavity is constant and it assumes values 
of opposite sign at the ends of the interval. Then, provided the right 
approximation formula has been chosen, the method of chords gives 
a sequence of points converging to the root of the equation f(x) = 0. 

If, on the other hand, the choice of the formula is made incorrectly,
the method of chords may take the point $a_2$ outside the interval [a, b].
This is illustrated by Fig. 10.

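The method of chords just described is a particular case of the
method of iterations. Suppose the function f(x) does not become zero
at x = a. In this case the equation f(x) = 0 is equivalent to the
equation

$$x = x - f(x)\,\frac{x - a}{f(x) - f(a)} \tag{45}$$

Indeed, if $f(\xi) = 0$ then the second term on the right-hand side of (45)
vanishes at $x = \xi$, so that $\xi$ satisfies equation (45). Applying the
method of iterations to equation (45) we obtain the formula of the
method of chords:

$$a_{n+1} = a_n - f(a_n)\,\frac{a_n - a}{f(a_n) - f(a)} \tag{46}$$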

As an example let us solve, using the method of chords, the equation

$$x^3 + 3x - 1 = 0 \tag{47}$$

Here $f(x) = x^3 + 3x - 1$. Since f(0) = -1, f(1) = 3, equation (47) has
at least one root in the interval [0, 1]. If we plot the graph of the
function $y = x^3 + 3x - 1$ we can see that on the interval [0, 1] it is concave
up. Therefore we use formula (39). According to this formula the first
approximation for this root is the number

$$x_1 = b - f(b)\,\frac{b - a}{f(b) - f(a)} = 1 - 3 \cdot \frac{1 - 0}{3 + 1} = 0.25$$

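To find the second approximation we use the formula

$$x_{n+1} = b - f(b)\,\frac{b - x_n}{f(b) - f(x_n)}$$

Then

$$x_2 = b - f(b)\,\frac{b - x_1}{f(b) - f(x_1)} = 1 - 3 \cdot \frac{1 - 0.25}{3 + 0.23} = 0.31$$

$$x_3 = 1 - 3 \cdot \frac{1 - 0.31}{3 + 0.040} = 0.319$$

$$x_4 = 1 - 3 \cdot \frac{1 - 0.319}{3 + 0.010} = 0.322$$

$$x_5 = 1 - 3 \cdot \frac{1 - 0.322}{3 + 0.0006} = 0.322$$

Hence, with an accuracy of 0.001 the root of the equation in the
interval [0, 1] is 0.322.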
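The computation above can be repeated mechanically. The following Python fragment is only a sketch under stated assumptions: the name chords_fixed_end and the stopping rule (stop when two successive approximations differ by less than the tolerance) are our own choices and are not taken from the text. Because the program does not round intermediate values, individual approximations may differ slightly in the last digit from those printed above.

```python
def f(x):
    # Left-hand side of equation (47).
    return x**3 + 3*x - 1


def chords_fixed_end(f, fixed, x, tol=1e-3, max_steps=100):
    """Method of chords with the end `fixed` of the interval kept in place.

    Each step joins the point (fixed, f(fixed)) with the point (x, f(x))
    and takes the intersection of this chord with the x-axis.
    """
    for _ in range(max_steps):
        x_new = fixed - f(fixed) * (fixed - x) / (f(fixed) - f(x))
        if abs(x_new - x) < tol:  # hypothetical stopping rule
            return x_new
        x = x_new
    return x


# The curve is concave up on [0, 1] and f(1) > 0, so the end b = 1 is fixed.
root = chords_fixed_end(f, fixed=1.0, x=0.0)
print(round(root, 3))
```

The first step of the loop reproduces the value 0.25 found from formula (39).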


12. Improved Method of Chords 

If the method of chords converges, its rate of convergence is the 
same as that of the method of iterations — the error in the value of the 
root decreases as a geometric progression. There is a way of improv¬ 
ing the method of chords so that its rate of convergence becomes 
much greater. In the usual method of chords we use at each step one of
the ends of the interval [a, b] and the last approximation obtained.
Instead the last two approximations may be used, for they are closer to
the root sought than the ends of the interval [a, b].

The formula which makes use of the last two approximations is of
the following form (Fig. 11a):

$$a_{n+1} = a_n - f(a_n)\,\frac{a_n - a_{n-1}}{f(a_n) - f(a_{n-1})} \tag{48}$$


Here $a_1$ is computed with the aid of formula (39) and $a_2$ with the aid
of formula (41) or (42), depending on the signs of f(a), f(b) and $f(a_1)$: if
f(a) < 0, f(b) > 0, then for $f(a_1) < 0$ formula (42) is chosen while for
$f(a_1) > 0$ formula (41) is chosen.

If by chance it turns out that point $a_3$ computed with the aid of
formula (48) lies outside the interval [a, b], during the next step the end of
the interval closest to it should be taken instead of this point
(Fig. 11b).

The convergence of the improved method of chords turns out to be
much better than that of the usual method, i.e. if $\xi$ is the root of the
equation f(x) = 0, then

$$|a_{n+1} - \xi| < C\,|a_n - \xi|^{s} \tag{49}$$

where

$$s = \frac{1 + \sqrt{5}}{2} \approx 1.618$$


As an example let us use this method to solve the same equation

$$x^3 + 3x - 1 = 0$$

which we solved above using the method of chords. The first
approximations $a_1 = 0.25$ and $a_2 = 0.31$ are the same as in the usual
method of chords.

The next approximation is computed with the aid of the formula

$$a_3 = a_2 - f(a_2)\,\frac{a_2 - a_1}{f(a_2) - f(a_1)} = 0.31 + 0.040 \cdot \frac{0.31 - 0.25}{-0.040 + 0.234} = 0.3223$$

We have f(0.3223) = 0.0004. Clearly, x = 0.3223 is the root
sought to an accuracy of 0.0001.
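Formula (48) is easy to program. The sketch below is an illustration only: the function name improved_chords and the stopping rule are assumptions of ours, and the safeguard described above for a point falling outside the interval [a, b] is left out for brevity.

```python
def f(x):
    return x**3 + 3*x - 1


def improved_chords(f, a_prev, a_cur, tol=1e-6, max_steps=50):
    """Iterate formula (48), using the last two approximations at each step."""
    for _ in range(max_steps):
        f_prev, f_cur = f(a_prev), f(a_cur)
        if f_cur == f_prev:  # the chord is horizontal, no intersection
            break
        a_next = a_cur - f_cur * (a_cur - a_prev) / (f_cur - f_prev)
        a_prev, a_cur = a_cur, a_next
        if abs(a_cur - a_prev) < tol:  # hypothetical stopping rule
            break
    return a_cur


# Start from the two approximations 0.25 and 0.31 used in the example.
root = improved_chords(f, 0.25, 0.31)
print(root, f(root))
```

Only a few passes through the loop are needed, which agrees with the faster convergence stated in (49).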


13. Derivative of Polynomial 

In solving the equation f(x) = 0 with the aid of the method of
iterations much depends on how the equation is reduced to the form
$x = \varphi(x)$. In many cases the best method is that proposed by Newton. It
is based on the concept of the derivative. In this section we shall talk
about what is meant by the derivative of a polynomial. This will
enable us to use Newton's method for the solution of algebraic
equations, i.e. of equations of the form

$$a_0 x^k + a_1 x^{k-1} + \ldots + a_k = 0 \tag{50}$$

Let

$$f(x) = a_0 x^k + a_1 x^{k-1} + \ldots + a_k$$

be a polynomial. Consider the polynomial $f(x + \alpha)$, i.e. the expression

$$a_0 (x + \alpha)^k + a_1 (x + \alpha)^{k-1} + \ldots + a_k \tag{51}$$

If we remove the brackets in expression (51) we find that in some of
the terms there are no $\alpha$'s at all, some terms contain them raised to the
first power, some to the second, etc. Let us group together the terms
containing $\alpha$ to the same power. Then the polynomial $f(x + \alpha)$ will
assume the form

$$f(x + \alpha) = f_0(x) + f_1(x)\alpha + f_2(x)\alpha^2 + \ldots + f_k(x)\alpha^k \tag{52}$$

(since the degree of the polynomial f(x) is k, the highest power of $\alpha$ in
the expansion (52) is also k). Evidently, $f_0(x), \ldots, f_k(x)$ are polynomials
in x as well.


Example. Let

$$f(x) = 2x^3 - 3x^2 + 6x - 1$$

Then

$$f(x + \alpha) = 2(x + \alpha)^3 - 3(x + \alpha)^2 + 6(x + \alpha) - 1 =$$
$$= 2(x^3 + 3x^2\alpha + 3x\alpha^2 + \alpha^3) - 3(x^2 + 2x\alpha + \alpha^2) + 6(x + \alpha) - 1 =$$
$$= (2x^3 - 3x^2 + 6x - 1) + (6x^2 - 6x + 6)\alpha + (6x - 3)\alpha^2 + 2\alpha^3$$

Consequently, in this case

$$f_0(x) = 2x^3 - 3x^2 + 6x - 1$$
$$f_1(x) = 6x^2 - 6x + 6$$
$$f_2(x) = 6x - 3$$
$$f_3(x) = 2$$

We see that the term $f_0(x)$ coincides with f(x). This is not a chance
coincidence. If we put $\alpha = 0$ in equation (52) we obtain $f(x) = f_0(x)$.

Let us now turn to the next term $f_1(x)\alpha$. The coefficient of $\alpha$, i.e. the
polynomial $f_1(x)$, is called the derivative of the polynomial f(x). For
instance, the derivative of the polynomial $2x^3 - 3x^2 + 6x - 1$ is
$6x^2 - 6x + 6$. The derivative of a polynomial is usually written as f'(x).

Hence, the derivative f'(x) of the polynomial f(x) is the coefficient of
$\alpha$ in the expansion of the polynomial $f(x + \alpha)$ in powers of $\alpha$.

Using the notation introduced above we may re-write formula (52)
in the following form:

$$f(x + \alpha) = f(x) + f'(x)\alpha + \ldots \tag{53}$$

The dots in the above formula denote terms containing $\alpha^2, \alpha^3, \ldots, \alpha^k$.

For instance,

$$2(x + \alpha)^3 - 3(x + \alpha)^2 + 6(x + \alpha) - 1 = 2x^3 - 3x^2 + 6x - 1 + (6x^2 - 6x + 6)\alpha + \ldots$$

We have introduced the concept of the derivative of a polynomial
f(x). Let us now show how to calculate this derivative. To do this
consider the polynomial

$$f(x + \alpha) = a_0 (x + \alpha)^k + a_1 (x + \alpha)^{k-1} + \ldots + a_{k-1}(x + \alpha) + a_k$$

Substituting for each term the expression
$(x + \alpha)^m = x^m + m x^{m-1}\alpha + \ldots$ (see Sec. 6) we obtain

$$f(x + \alpha) = a_0 (x^k + k x^{k-1}\alpha + \ldots) + a_1 [x^{k-1} + (k - 1)x^{k-2}\alpha + \ldots] + \ldots + a_{k-1}(x + \alpha) + a_k =$$
$$= a_0 x^k + a_1 x^{k-1} + \ldots + a_k + \alpha\,[k a_0 x^{k-1} + (k - 1) a_1 x^{k-2} + \ldots + a_{k-1}] + \ldots$$

Comparing this equation with (53),

$$f(x + \alpha) = f(x) + \alpha f'(x) + \ldots$$

we can make the following statement:

The derivative of a polynomial

$$f(x) = a_0 x^k + a_1 x^{k-1} + \ldots + a_{k-1} x + a_k \tag{54}$$

is of the form

$$f'(x) = k a_0 x^{k-1} + (k - 1) a_1 x^{k-2} + \ldots + a_{k-1} \tag{55}$$

For instance, the derivative of the polynomial

$$f(x) = 6x^7 + 8x^3 - 3x^2 - 1$$

is

$$f'(x) = 42x^6 + 24x^2 - 6x$$
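Rule (55) acts on the list of coefficients only, so it is convenient to state it as a short program. The sketch below assumes, as a convention of our own, that a polynomial is stored as the list of its coefficients from the highest power down; the name poly_derivative is hypothetical.

```python
def poly_derivative(coeffs):
    """Differentiate a_0 x^k + a_1 x^(k-1) + ... + a_k by rule (55).

    `coeffs` lists a_0, ..., a_k from the highest power down.
    """
    k = len(coeffs) - 1
    # The term a_i x^(k-i) becomes (k-i) a_i x^(k-i-1); the constant a_k drops out.
    return [(k - i) * a for i, a in enumerate(coeffs[:-1])] or [0]


# f(x) = 6x^7 + 8x^3 - 3x^2 - 1
print(poly_derivative([6, 0, 0, 0, 8, -3, 0, -1]))
# f(x) = 2x^3 - 3x^2 + 6x - 1
print(poly_derivative([2, -3, 6, -1]))
```

The two lists printed correspond to the polynomials 42x^6 + 24x^2 - 6x and 6x^2 - 6x + 6 obtained in this section.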


14. Newton's Method for Approximate Solution 
of Algebraic Equations 

Let us return now to the approximate solution of algebraic
equations. Suppose we have an equation

$$a_0 x^k + a_1 x^{k-1} + \ldots + a_k = 0 \tag{56}$$

Suppose we have somehow managed to find an approximate value $x_1$
of the root of this equation. We shall show how a more accurate value
of this root can be found. Let $\alpha_1$ be the error in the value $x_1$, i.e.
suppose $x_1 + \alpha_1$ is the root of equation (56). Then we must have

$$a_0 (x_1 + \alpha_1)^k + a_1 (x_1 + \alpha_1)^{k-1} + \ldots + a_k = 0 \tag{57}$$

In other words,

$$f(x_1 + \alpha_1) = 0$$

where f(x) denotes the polynomial

$$a_0 x^k + a_1 x^{k-1} + \ldots + a_k$$

But according to formula (53) we have

$$f(x_1 + \alpha_1) = f(x_1) + \alpha_1 f'(x_1) + \ldots$$

where the dots denote terms containing $\alpha_1^2, \ldots, \alpha_1^k$. Hence, to
determine $\alpha_1$ we have the equation

$$f(x_1 + \alpha_1) = f(x_1) + \alpha_1 f'(x_1) + \ldots = 0 \tag{58}$$


If the initial approximation $x_1$ is good enough, its error $\alpha_1$ will be
small. In this case the terms in equation (58) denoted by dots will be
small in comparison with $\alpha_1$. Neglecting these terms we obtain an
approximate equation to determine $\alpha_1$:

$$f(x_1) + \alpha_1 f'(x_1) \approx 0 \tag{59}$$

It follows from this that

$$\alpha_1 \approx -\frac{f(x_1)}{f'(x_1)} \tag{60}$$

Therefore the formula for the improved value $x_2$ of the root of our
equation will be

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} \tag{61}$$


Then we may again improve the approximation obtained. The
formula for the third approximation is

$$x_3 = x_2 - \frac{f(x_2)}{f'(x_2)}$$

In general, if the n-th approximation $x_n$ for the root sought has been
found, the formula for the next approximation is

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \tag{62}$$

In detailed form the formula is written down as follows:

$$x_{n+1} = x_n - \frac{a_0 x_n^k + a_1 x_n^{k-1} + \ldots + a_{k-1} x_n + a_k}{k a_0 x_n^{k-1} + (k - 1) a_1 x_n^{k-2} + \ldots + a_{k-1}} \tag{63}$$

If the approximate values of $x_n$ and $x_{n+1}$ coincide within the
accuracy specified, our process (within the limits of accuracy specified) will
be concluded and the value of the root sought will have been found.

The method of solving equations described above is due to the 
famous English mathematician Newton. 

Newton's method is closely connected with the method of
iterations. Specifically, if the functions y = f(x) and y = f'(x) have no
common roots, the equation f(x) = 0 will be equivalent to the
equation

$$x = x - \frac{f(x)}{f'(x)} \tag{64}$$

Applying the method of iterations to this equation we obtain
a sequence of numbers $x_1, x_2, \ldots, x_n$ connected by the same relation

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \tag{65}$$

as in Newton's method. In other words, Newton's method consists in
writing down the equation f(x) = 0 in the form of (64) and applying to
it the method of iterations.

Example. Use Newton's method to solve the equation

$$x^3 - 3x - 5 = 0$$

with an accuracy of 0.001 choosing the first approximation $x_1 = 3$.
Since the derivative of the polynomial

$$f(x) = x^3 - 3x - 5$$

is the polynomial

$$f'(x) = 3x^2 - 3$$

formula (62) assumes the form

$$x_{n+1} = x_n - \frac{x_n^3 - 3x_n - 5}{3x_n^2 - 3}$$

Therefore

$$x_2 = 3 - \frac{27 - 9 - 5}{27 - 3} = 3 - \frac{13}{24} = 2.46$$

$$x_3 = 2.46 - \frac{14.887 - 7.380 - 5}{18.15 - 3} = 2.46 - 0.165 = 2.295$$

$$x_4 = 2.295 - \frac{12.088 - 6.885 - 5}{15.801 - 3} = 2.295 - 0.016 = 2.279$$

$$x_5 = 2.279 - \frac{11.837 - 6.837 - 5}{15.582 - 3} = 2.279$$

We see that with an accuracy of 0.001

$$x_4 = x_5$$

Therefore the root of the equation $x^3 - 3x - 5 = 0$ is equal to 2.279
with an accuracy of 0.001.
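The same computation can be entrusted to a short program. The sketch below applies formula (62) to this equation; the function name newton and the decision to keep every approximation in a list are our own choices. Intermediate values are not rounded, so they may differ from the hand computation in the last digit.

```python
def newton(f, df, x, tol=1e-3, max_steps=50):
    """Newton's method, formula (62); returns all approximations found."""
    steps = [x]
    for _ in range(max_steps):
        x_new = x - f(x) / df(x)
        steps.append(x_new)
        if abs(x_new - x) < tol:  # two successive values agree within tol
            break
        x = x_new
    return steps


f = lambda x: x**3 - 3*x - 5
df = lambda x: 3*x**2 - 3

for n, x in enumerate(newton(f, df, 3.0), start=1):
    print(n, round(x, 3))
```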

The method of approximate computation of roots presented in Sec. 6
is a particular case of Newton's method. Indeed, finding $\sqrt[k]{a}$ just
means solving the equation

$$x^k - a = 0$$

But the derivative of the polynomial $x^k - a$ is $k x^{k-1}$ and therefore
formula (62) for the equation

$$x^k - a = 0$$

takes the form

$$x_{n+1} = x_n - \frac{x_n^k - a}{k x_n^{k-1}} = \frac{a + (k - 1)x_n^k}{k x_n^{k-1}}$$

This is just the formula that was used to compute the approximations
to $\sqrt[k]{a}$.

Note the following essential difference between the solution of the
equation $x^k - a = 0$ and the solution of the general algebraic equation

$$a_0 x^k + a_1 x^{k-1} + \ldots + a_k = 0$$

For the equation $x^k - a = 0$ the choice of the initial approximation $x_1$
was of no importance. No matter what value of $x_1$ was chosen, after
a certain number of steps we obtained the value of $\sqrt[k]{a}$ with the
accuracy specified. In the case of the solution of equation (56) the situation
is different. Here one initial value results in one root, another initial
value results in a different root while some initial values result in no
definite value at all: the sequence of numbers $x_1, x_2, \ldots, x_n, \ldots$
computed using formula (62) does not tend to any definite limit, that is, it
diverges.
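A small experiment makes this dependence on the initial value visible. The two polynomials below are illustrations chosen by us and do not come from the text: $x^3 - x$ has the three roots -1, 0 and 1, while for $x^3 - 2x + 2$ the initial value 0 leads to a sequence that jumps between 0 and 1 for ever. The helper name newton_steps is hypothetical.

```python
def newton_steps(f, df, x, n):
    """Return x together with the next n approximations from formula (62)."""
    xs = [x]
    for _ in range(n):
        x = x - f(x) / df(x)
        xs.append(x)
    return xs


g = lambda x: x**3 - x          # roots -1, 0 and 1
dg = lambda x: 3*x**2 - 1

for start in (2.0, -2.0, 0.1):
    print(start, "->", newton_steps(g, dg, start, 20)[-1])

h = lambda x: x**3 - 2*x + 2
dh = lambda x: 3*x**2 - 2

# Starting from 0 the approximations alternate between 0 and 1.
print(newton_steps(h, dh, 0.0, 4))
```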

15. Geometrical Meaning of Derivative 


So far we have given Newton's method only for algebraic
equations. In order to generalize it to apply to equations of arbitrary form
we shall have to generalize the concept of the derivative and introduce
it for functions of all types. To do this let us explain the geometrical
meaning of the derivative.

Consider the graph of the polynomial

$$y = a_0 x^k + a_1 x^{k-1} + \ldots + a_k$$

and take two points M and N on this graph (Fig. 12). Let the abscissa
of the point M be x and the abscissa of the point N be $x + \alpha$. Then the
ordinates of the points M and N will be given by the expressions

$$f(x) = a_0 x^k + a_1 x^{k-1} + \ldots + a_k$$

and

$$f(x + \alpha) = a_0 (x + \alpha)^k + a_1 (x + \alpha)^{k-1} + \ldots + a_k$$

respectively. Draw a secant through the points M and N and calculate
its slope $k_{\mathrm{sec}}$*. It may be seen from the drawing that

$$k_{\mathrm{sec}} = \tan\psi = \frac{TN}{MT}$$

* By the slope of a line is meant the tangent of the angle of inclination of the
line with the positive direction of the x-axis. For instance, if a line makes an
angle of 60° with the x-axis, its slope is equal to $\sqrt{3}$.

But the segment MT is equal to the difference in the abscissas of the
points M and N and therefore

$$MT = (x + \alpha) - x = \alpha$$

The segment TN is equal to the difference in the ordinates of these
points and therefore

$$TN = f(x + \alpha) - f(x)$$

It follows that

$$\tan\psi = \frac{TN}{MT} = \frac{f(x + \alpha) - f(x)}{\alpha}$$

But by formula (53)

$$f(x + \alpha) = f(x) + \alpha f'(x) + \ldots$$

where the dots denote terms containing $\alpha^2, \alpha^3, \ldots$ Therefore

$$\tan\psi = \frac{\alpha f'(x) + \ldots}{\alpha} = f'(x) + \ldots$$

where the dots now denote terms containing $\alpha, \alpha^2, \ldots$

Hence, the slope of the secant MN is expressed by the formula

$$k_{\mathrm{sec}} = \tan\psi = f'(x) + \ldots \tag{66}$$

Let us now start decreasing the value of $\alpha$. In doing this secant MN
will turn around the point M. In the limiting case, when $\alpha = 0$, the
secant will become the tangent to the curve y = f(x) at the point M.
Figure 13 shows the positions of the secant for $\alpha$ = 1; 1/2; 1/4.

But when $\alpha = 0$ all the terms denoted by dots in formula (66) vanish.
Therefore the slope of the tangent to the graph of a polynomial
y = f(x) at the point with abscissa x is expressed by the formula

$$k_{\mathrm{tan}} = f'(x) \tag{67}$$

Hence, the derivative of a polynomial f(x) is equal to the slope of the
tangent to the graph of the polynomial at the point with abscissa x.

Example. Find the angle that the tangent to the graph of the
polynomial

$$f(x) = x^3 - 4x^2 + 5x + 1$$

drawn at the point x = 2, makes with the x-axis.

Since

$$f'(x) = 3x^2 - 8x + 5$$

we have f'(2) = 1. Consequently, $\tan\varphi = 1$ and the angle is $\varphi = 45°$.


16. Geometrical Meaning of Newton's Method 

We can now clarify the geometrical meaning of Newton's
method for the approximate solution of algebraic equations. Suppose
we need to solve the equation f(x) = 0 where f(x) is a polynomial.
Geometrically, this problem is the problem of finding the points of
intersection of the graph of the function y = f(x) with the x-axis,
i.e. the points where y = 0.



Suppose an approximate value of the root of this equation $x_1$ has
already been found. Draw a tangent to the curve y = f(x) at the point
N with abscissa $x_1$. If the choice of $x_1$ is fortunate, the point T of
intersection of the tangent with the x-axis will be nearer to the point of
intersection of the curve y = f(x) with the x-axis than the point M
(Fig. 14).

To find the abscissa $x_2$ of point T consider the triangle TMN. The
side MN of the right-angled triangle is just the value of the function
y = f(x) at the point $x_1$, i.e. $MN = f(x_1)$. On the other hand, the side
TM is equal to $x_1 - x_2$. Hence, the tangent of the angle $\varphi_1$ which the
tangent line makes with the x-axis is expressed by the formula

$$\tan\varphi_1 = \frac{f(x_1)}{x_1 - x_2} \tag{68}$$

From (68) it follows that

$$x_2 = x_1 - \frac{f(x_1)}{\tan\varphi_1} \tag{69}$$

But $\tan\varphi_1$ is the slope of the tangent to the curve y = f(x) drawn at the
point having abscissa $x_1$. Therefore, in accordance with the geometrical
meaning of the derivative, $\tan\varphi_1 = f'(x_1)$.

Hence, formula (69) may be re-written in this form:

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}$$

Thus we have found the second approximation for the root sought.
Let us now draw a tangent to the curve y = f(x) at the point with
abscissa $x_2$. The abscissa of the point of intersection of this tangent with
the x-axis is given by the formula

$$x_3 = x_2 - \frac{f(x_2)}{f'(x_2)}$$

In general, if the approximation $x_n$ has already been found, then in
order to obtain the next approximation $x_{n+1}$ one should draw a
tangent to the curve y = f(x) at the point with abscissa $x_n$. The abscissa of
the point of intersection of this tangent with the x-axis will then
provide us with the value of $x_{n+1}$.

The formula for calculating $x_{n+1}$ is

$$\tan\varphi_n = \frac{f(x_n)}{x_n - x_{n+1}} \tag{70}$$

or, equivalently,

$$x_{n+1} = x_n - \frac{f(x_n)}{\tan\varphi_n} \tag{70'}$$

where $\varphi_n$ is the angle that the tangent to the curve y = f(x) at the point
with abscissa $x_n$ makes with the x-axis. This formula coincides with
formula (62) of Newton's method. Thus, we have found the geometrical
meaning of Newton's method. The essence of it is the substitution
of the tangent to the curve y = f(x) for its arc. For this reason another
name for Newton's method is the method of tangents.



Figure 15 shows how the points $x_1, x_2, \ldots, x_n$, obtained using
Newton's method, approach the point $\xi$ of intersection of the curve
y = f(x) with the x-axis.


17. Derivatives of Arbitrary Functions 

The geometrical interpretation of Newton's method given above
enables us to generalize it to apply to any equations of the form
f(x) = 0, where f(x) may now be some function other than a polynomial.
To find the solution of this equation take some approximate value $x_1$
of its root. Draw a tangent to the curve y = f(x) at the point with
abscissa $x_1$ and denote its point of intersection with the x-axis by $x_2$.
Draw a new tangent to the curve y = f(x) at the point having abscissa
$x_2$, etc. It may easily be established that, as in the case of a polynomial,

$$x_{n+1} = x_n - \frac{f(x_n)}{\tan\varphi_n} \tag{71}$$

where $\tan\varphi_n$ is the slope of the tangent to the curve y = f(x) at the
point with abscissa $x_n$.

Formula (71) is still of no use for calculations because we don't
know how to find $\tan\varphi_n$. So we have to learn to calculate the slope of
tangents drawn to graphs of arbitrary functions y = f(x) (and not only
to graphs of polynomials). Let us first find the slope of the secant. Let
M be a point on the graph of the function y = f(x) and MN a secant
passing through this point. Applying the same reasoning as in the case
of polynomials we conclude that the slope of the secant is given by the
formula

$$k_{\mathrm{sec}} = \tan\psi = \frac{f(x + \alpha) - f(x)}{\alpha} \tag{72}$$


where x is the abscissa of the point M and $x + \alpha$ the abscissa of the
point N. If we decrease $\alpha$, the secant will turn around the point M
until it eventually occupies the position of the tangent to the curve
y = f(x) at this point (see Fig. 12). Therefore we may write that

$$k_{\mathrm{tan}} = \tan\varphi = \lim_{\alpha \to 0} \frac{f(x + \alpha) - f(x)}{\alpha} \tag{73}$$

We shall call the limit on the right-hand side the derivative of the
function f(x), and denote it by f'(x), i.e. we shall put

$$f'(x) = \lim_{\alpha \to 0} \frac{f(x + \alpha) - f(x)}{\alpha} \tag{74}$$

Now we may write the equality (73) in the form

$$k_{\mathrm{tan}} = \tan\varphi = f'(x) \tag{75}$$

Thus, the derivative f'(x) of any function at some point (not just of
a polynomial) is equal to the slope of the tangent drawn at this point
to the curve y = f(x).*
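The passage from the secant to the tangent can be watched numerically. In the sketch below the function sin x and the point x = 1 are our own choice of illustration, and secant_slope is a hypothetical name; the slopes are compared with cos 1, the value of the derivative of sin x at this point.

```python
import math


def secant_slope(f, x, alpha):
    """Slope of the secant through the points with abscissas x and x + alpha."""
    return (f(x + alpha) - f(x)) / alpha


x = 1.0
for alpha in (1.0, 0.5, 0.25, 1e-3, 1e-6):
    print(alpha, secant_slope(math.sin, x, alpha))

print("cos(1) =", math.cos(x))
```

As alpha decreases, the printed slopes approach cos 1, in agreement with the definition of the derivative as a limit.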

Since $\tan\varphi_n = f'(x_n)$, formula (71) may be re-written in the form

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \tag{76}$$

This formula coincides with formula (62). Thus, Newton's method has
been extended to apply to all equations of the form f(x) = 0.


*If it is impossible to draw a tangent to the graph of the function y = f(x) at
a point with abscissa x (for instance, if the graph has a break at this point, i.e.
it bends sharply at an angle) then the function has no derivative at this point.

18. Computation of Derivatives 

We have seen in the preceding section that to find the slope of the
tangent to the curve y = f(x) we have to compute the limit

$$f'(x) = \lim_{\alpha \to 0} \frac{f(x + \alpha) - f(x)}{\alpha}$$

This computation is, generally speaking, quite difficult. But for 
many important cases the limit has already been worked out. In other 
words, the derivatives of the more frequently used functions are 
known. Below is a list of the most common derivatives. 


1. $(a)' = 0$

2. $(x^k)' = k x^{k-1}$

3. $(a^x)' = a^x \ln a$

4. $(\sin ax)' = a \cos ax$

5. $(\cos ax)' = -a \sin ax$

6. $(\tan ax)' = \dfrac{a}{\cos^2 ax}$

7. $(\cot ax)' = -\dfrac{a}{\sin^2 ax}$

8. $(\log_a x)' = \dfrac{1}{x \ln a}$

9. $(\arcsin ax)' = \dfrac{a}{\sqrt{1 - a^2 x^2}}$

10. $(\arctan ax)' = \dfrac{a}{1 + a^2 x^2}$


(ln a in formulas 3 and 8 denotes the logarithm to the base
e = 2.71828..., the so-called natural, or Napierian, logarithm). Note
that k in formula 2 may be not only a natural number, but any real
number. For example,

$$(\sqrt{x})' = (x^{1/2})' = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}}$$

$$\left(\frac{1}{x^2}\right)' = (x^{-2})' = -2x^{-3} = -\frac{2}{x^3}$$

The formulas 1-10 are insufficient to calculate the derivatives of all
functions. However, if the function f(x) is made up by using
arithmetical operations of functions whose derivatives we know how to find
then we may easily find its derivative. To do this we use the following
rules which are proved (together with formulas 1-10) in higher
mathematics courses.

1. The derivative of the sum of two functions is equal to the sum of
their derivatives, i.e.

$$[f_1(x) + f_2(x)]' = f_1'(x) + f_2'(x)$$

2. A constant multiplier may be taken outside the derivative sign:

$$[a f(x)]' = a f'(x)$$

3. The formula for calculating the derivative of a product of two
functions is

$$[f_1(x) f_2(x)]' = f_1'(x) f_2(x) + f_1(x) f_2'(x)$$

4. The formula for calculating the derivative of a fraction is

$$\left[\frac{f_1(x)}{f_2(x)}\right]' = \frac{f_1'(x) f_2(x) - f_1(x) f_2'(x)}{[f_2(x)]^2}$$

The rule given in Sec. 13 for calculating the derivative of a
polynomial is a corollary of rules 1 and 2 and formula 2 of the list.

Example 1. Find the derivative of the fraction

$$f(x) = \frac{3x^2 - x + 1}{2x^3 + 5}$$

Applying rule 4 we obtain

$$f'(x) = \frac{(3x^2 - x + 1)'(2x^3 + 5) - (3x^2 - x + 1)(2x^3 + 5)'}{(2x^3 + 5)^2}$$

Next, applying the rule for the differentiation of a polynomial, we find

$$(3x^2 - x + 1)' = 6x - 1$$

and

$$(2x^3 + 5)' = 6x^2$$

and therefore

$$f'(x) = \frac{(2x^3 + 5)(6x - 1) - (3x^2 - x + 1)6x^2}{(2x^3 + 5)^2} = \frac{-6x^4 + 4x^3 - 6x^2 + 30x - 5}{(2x^3 + 5)^2}$$

Example 2. Find the derivative of the function

$$f(x) = \frac{1}{10}\left(\arcsin 3x - \frac{1}{x^2}\right)$$

Solution. Using formulas 2 and 9 and rules 1 and 2 we obtain

$$f'(x) = \frac{1}{10} \cdot \frac{3}{\sqrt{1 - 9x^2}} - \frac{1}{10}\left(-\frac{2}{x^3}\right) = \frac{3}{10\sqrt{1 - 9x^2}} + \frac{1}{5x^3}$$

Example 3. Find the derivative of the function

$$f(x) = 10^x \sin 2x$$

Applying rule 3 and formulas 3 and 4 we obtain

$$f'(x) = (10^x)' \sin 2x + 10^x (\sin 2x)' =$$
$$= 10^x \sin 2x \ln 10 + 10^x \cdot 2\cos 2x =$$
$$= 10^x (\sin 2x \ln 10 + 2\cos 2x)$$

The rules presented above enable the derivative to be found in
many cases. There is one other very important rule: the rule for the
calculation of the derivative of a complicated function. It is
formulated as follows:

If a function y = f(x) may be written in the form y = F(z) where
$z = \varphi(x)$, then its derivative is given by the formula

$$f'(x) = F'(z)\,\varphi'(x) \tag{77}$$

where $z = \varphi(x)$.

Example. Find the derivative of the function $y = \sin(x^3)$. This
function may be written down in the form y = sin z where $z = x^3$. The
derivative of the function F(z) = sin z is F'(z) = cos z and the
derivative of the function $\varphi(x) = x^3$ is $\varphi'(x) = 3x^2$. Using formula (77) we
obtain

$$[\sin(x^3)]' = F'(z)\,\varphi'(x) = \cos z \cdot 3x^2$$

Substituting for z its value $z = x^3$ we obtain

$$[\sin(x^3)]' = 3x^2 \cos(x^3)$$

The reader can find a more detailed explanation of the concept of
the derivative, for instance, in the book by Ya. B. Zeldovich, Higher
Mathematics for the Beginner.
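A derivative found by these rules can be checked numerically by comparing it with a difference quotient taken for a small increment. The sketch below does this for the functions $10^x \sin 2x$ and $\sin(x^3)$ differentiated in this section. The symmetric quotient used in central_difference, the step h and the test points are assumptions of ours, not part of the text.

```python
import math


def central_difference(f, x, h=1e-5):
    """Symmetric difference quotient, a numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def f3(x):
    return 10**x * math.sin(2 * x)


def df3(x):
    # Product rule with formulas 3 and 4 of the list.
    return 10**x * (math.sin(2 * x) * math.log(10) + 2 * math.cos(2 * x))


def g(x):
    return math.sin(x**3)


def dg(x):
    # Formula (77) with F(z) = sin z and z = x^3.
    return 3 * x**2 * math.cos(x**3)


for x in (0.3, 0.7, 1.2):
    print(x, df3(x) - central_difference(f3, x), dg(x) - central_difference(g, x))
```

The printed differences are very small, which supports the formulas obtained by the rules.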

19. Finding the First Approximations 

Let us deal now with the choice of initial approximations. When
solving the equation f(x) = 0 the first approximation may be found
graphically. For this one should plot the function y = f(x) and find the
points of intersection of the graph with the x-axis (at these points
y = 0 and therefore f(x) = 0).

If for some reason or other it is inconvenient to plot the function
(for instance, if the equation is being solved by a computer), another
method of finding the first approximation is employed. In this method
the values of the function for some values of its argument are
computed (for instance, for integral values of the argument lying inside
definite bounds). If the function y = f(x) is continuous (i.e. its graph
has no discontinuities), then there is a root of the equation f(x) = 0
lying between the values a and b of the argument for which the
function assumes values of opposite sign (Fig. 16a). If the graph of the
function contains discontinuities, it may happen that the function
moves from the negative values to positive ones in a jump without
passing through zero on the way (Fig. 16b). The values a or b may be
taken as first approximations for the root of the equation f(x) = 0.




Fig. 16

Note that by this method we may leave out some of the roots of the 
equation. For instance, Fig. 16c depicts the case when the function 
y = f(x) has the same sign at two points, but vanishes between them. 
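The search over integral values of the argument is easily programmed. The sketch below looks for neighbouring integers at which the function has opposite signs; the name sign_change_brackets and the bounds -3 and 3 are our own choices, and the polynomial $x^3 - 3x - 5$ is the one solved by Newton's method in Sec. 14. An integer at which the function is exactly zero is not reported by this simple version.

```python
def sign_change_brackets(f, lo, hi):
    """Return the pairs (a, a + 1), lo <= a < hi, with f(a) and f(a + 1) of opposite sign."""
    brackets = []
    for a in range(lo, hi):
        b = a + 1
        if f(a) * f(b) < 0:
            brackets.append((a, b))
    return brackets


f = lambda x: x**3 - 3*x - 5
print(sign_change_brackets(f, -3, 3))
```

For this polynomial the only pair found is (2, 3), so either 2 or 3 may serve as a first approximation.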

We have thus obtained two points: a and b. To find out which of
them should best be chosen as the initial approximation $x_1$ in Newton's
method, consider Fig. 17. Figures 17a and 17b show that if the curve is
concave up whichever of the two points a and b for which the function
f(x) is positive should be chosen as the initial approximation. A different
choice of the initial approximation may even result in the point $x_2$ being
outside the interval [a, b]. Similarly, if the curve is concave down, the
point where the function f(x) is negative (Fig. 17c, d) should be chosen as
the initial approximation.

This rule may be conveniently used if the graph of the function
y = f(x) is known. If, on the other hand, the function has not been
plotted, then additional calculations are necessary to determine the
direction of the function's concavity. In order to do this, the second
derivative of the function f(x) should be found. By the second derivative of the
function f(x) is meant the derivative of its first derivative.

For instance, if we are given the function

$$f(x) = x^3 - 4x^2 + 3x - 1$$

its first derivative is

$$f'(x) = 3x^2 - 8x + 3$$

and the second

$$f''(x) = 6x - 8$$

Higher mathematics courses contain the proof that if the second
derivative is positive on the interval [a, b], the curve on this interval is
concave up. If, on the other hand, the second derivative in the interval
[a, b] is negative, the curve is concave down. Making use of this
information we obtain the following rule for the use of Newton's method:

Fig. 17

Let the function f(x) have opposite signs at points a and b and let the
second derivative of the function f(x) be positive on the interval [a, b].
Then for the initial approximation $x_1$ one should choose that of the points
a or b for which the function f(x) assumes a positive value. If on the other
hand, the second derivative is negative on the interval [a, b], then the
point where the function f(x) assumes a negative value should be chosen
for the initial approximation $x_1$.
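The rule just stated may be written as a small program. The sketch below assumes that the second derivative keeps one sign on the whole interval and therefore examines it at the midpoint only; this simplification, like the name newton_start, is ours and not part of the text.

```python
import math


def newton_start(f, d2f, a, b):
    """Choose the end of [a, b] at which f has the same sign as f''.

    Assumes f(a) and f(b) have opposite signs and that f'' does not
    change sign on [a, b]; its sign is sampled at the midpoint only.
    """
    concave_up = d2f((a + b) / 2) > 0
    if concave_up:
        return a if f(a) > 0 else b
    return a if f(a) < 0 else b


# x^3 - 3x - 5 on [2, 3]: the second derivative 6x is positive.
print(newton_start(lambda x: x**3 - 3*x - 5, lambda x: 6*x, 2, 3))

# x - sin x - 0.5 on [1, 2]: the second derivative sin x is positive.
print(newton_start(lambda x: x - math.sin(x) - 0.5, math.sin, 1, 2))
```

In both cases the right-hand end of the interval is selected, since the function is positive there.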


20. Combined Method of Solving Equations 

In solving equations Newton's method is often combined with the
method of chords. If the graph of the function y = f(x) is concave up,
then the points $a_1$ and $x_1$ are found using formulas

$$a_1 = a - f(a)\,\frac{b - a}{f(b) - f(a)} \tag{78}$$

$$x_1 = b - \frac{f(b)}{f'(b)} \tag{79}$$

If, on the other hand, the graph of the function y = f(x) is concave
down, then formula (78) is used to find the point $a_1$ and the formula

$$x_1 = a - \frac{f(a)}{f'(a)} \tag{80}$$

to find $x_1$.

As may be seen from Fig. 18a and 18b, the root $\xi$ of the equation
f(x) = 0 usually lies between the points $a_1$ and $x_1$. After Newton's
method and the method of chords are again applied, we get a new pair
of points $a_2$ and $x_2$, etc.

Fig. 18

In this way two sequences of points are obtained: $a_1, a_2, \ldots, a_n, \ldots$
and $x_1, x_2, \ldots, x_n, \ldots$ which approach the root $\xi$ sought from
different sides. The advantage of the method described is that it
approximates the roots from both above and below.

Example. Using the combined method solve the equation

$$x - \sin x - 0.5 = 0$$

with an accuracy of 0.001.

Compile a table of values of the continuous function

$$f(x) = x - \sin x - 0.5$$

    x       -1        0        1        2
    f(x)    -0.659    -0.5     -0.341   0.591

It follows from this table that the root of this equation lies between
1 and 2. Using formulas 2 and 4 of Sec. 18 we obtain

$$f'(x) = 1 - \cos x$$

Therefore Newton's formula assumes in our case the form

$$x_{n+1} = x_n - \frac{x_n - \sin x_n - 0.5}{1 - \cos x_n} \tag{81}$$

To find out which of the values, 1 or 2, should be taken for $x_0$ let us
find the second derivative of the function f(x). According to formula
5 of Sec. 18, it is of the form f''(x) = sin x. But on the interval [1, 2]
the function sin x is positive*. Consequently, using the rule established
above, the value 2 for which the function f(x) is positive should be
taken for $x_0$.

According to formula (81) we have

$$x_1 = 2 - \frac{2 - \sin 2 - 0.5}{1 - \cos 2} = 2 - \frac{2 - 0.909 - 0.5}{1 + 0.416} = 1.583$$

On the other hand, using formula (78) we obtain

$$a_1 = 1 - (-0.341) \cdot \frac{2 - 1}{0.591 - (-0.341)} = 1.366$$

Then applying formulas (81) and (78) to the interval $[a_1, x_1]$ we
obtain

$$x_2 = 1.583 - \frac{1.583 - 1.000 - 0.5}{1 + 0.012} = 1.501$$

and

$$a_2 = 1.366 + 0.113 \cdot \frac{1.583 - 1.366}{0.083 + 0.113} = 1.491$$

Next we find

$$x_3 = 1.498$$
$$a_3 = 1.498$$

Hence the root of our equation with an accuracy of 0.001 is 1.498.

*The function sin x is positive on the interval $[0, \pi] = [0, 3.141...]$.
Therefore sin x is positive also on the subinterval [1, 2] of this interval.
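The combined method is well suited to a program, since the two sequences themselves show when to stop: the root lies between the current points, so the work may end as soon as they are closer together than the accuracy required. The sketch below treats the concave-up case of this example; the name combined and the stopping rule are our own assumptions. Intermediate values are not rounded, so the last digit may differ from the hand computation.

```python
import math


def combined(f, df, a, x, tol=1e-3, max_steps=50):
    """Chord step from the side a, Newton step from the side x.

    Intended for the case where the root lies between a and x and
    Newton's method is started from x, as in the example above.
    """
    for _ in range(max_steps):
        if abs(x - a) < tol:  # the root is enclosed closely enough
            break
        a_new = a - f(a) * (x - a) / (f(x) - f(a))   # formula (78) on [a, x]
        x_new = x - f(x) / df(x)                     # Newton's formula
        a, x = a_new, x_new
    return a, x


f = lambda x: x - math.sin(x) - 0.5
df = lambda x: 1 - math.cos(x)

a_n, x_n = combined(f, df, 1.0, 2.0)
print(a_n, x_n)
```

The two numbers printed enclose the root and agree with each other to within the tolerance.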


21. Convergence Test for Method of Iterations 

Let us now apply the concept of the derivative to the derivation of
a new convergence test for the method of iterations. We shall need for
this purpose another formula called the Lagrange formula (Lagrange
was a French mathematician of the 18th century).

Consider the curve y = f(x) on the interval [a, b]. Denote by M the
initial point of this curve and by N its end point and draw a chord
MN. The slope of this chord is

$$k_{\mathrm{chord}} = \tan\psi = \frac{PN}{MP}$$

(Fig. 19). But MP = b - a and PN = f(b) - f(a), and so

$$k_{\mathrm{chord}} = \frac{f(b) - f(a)}{b - a}$$

Let us now denote by $T$ the point of the arc $MN$ furthest from the
chord $MN$. If we draw a straight line parallel to the chord through this
point, it will be tangent to the curve, for if it intersected the curve it
would mean that there were points more distant from the chord $MN$
than the point $T$. In other words, the tangent to the curve at point $T$ is
parallel to the chord $MN$ and has therefore the same slope as the
chord. But the slope of the tangent is equal to $f'(c)$, where $c$ is the
abscissa of the point $T$. So we have the formula

$$f'(c) = \frac{f(b) - f(a)}{b - a} \qquad (82)$$

This formula is called the Lagrange formula. Note that the point
$c$ in the Lagrange formula always lies between the points $a$ and $b$. The
Lagrange formula may also be written in the form

$$f(b) - f(a) = f'(c)(b - a) \qquad (83)$$
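
To make the Lagrange formula concrete, here is a small Python sketch (an added illustration, not from the original text) that looks for the point $c$ numerically for the sample function $f(x) = x^3$ on $[0, 3]$. The helper name `find_lagrange_point` is hypothetical, and the bisection used here assumes that $f'(x)$ minus the chord slope changes sign exactly once on the interval.

```python
def find_lagrange_point(f, df, a, b, tol=1e-10):
    """Find c in [a, b] with f'(c) equal to the slope of the chord,
    by bisection on g(x) = f'(x) - slope (assumes g changes sign once)."""
    slope = (f(b) - f(a)) / (b - a)
    g = lambda t: df(t) - slope
    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("g does not change sign on [a, b]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

cube = lambda t: t ** 3
dcube = lambda t: 3 * t ** 2

c = find_lagrange_point(cube, dcube, 0.0, 3.0)
print(c)  # the chord slope is 9, so 3c^2 = 9 and c is the square root of 3
```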

Let us return now to the solution of the equation $x = \varphi(x)$ by the
method of iterations. Suppose the mapping $y = \varphi(x)$ maps the interval
$[a, b]$ into itself so that on this interval the inequality $|\varphi'(x)| \le q$
holds, where $q$ is a number less than unity, $q < 1$. Take any two points
$x_1$ and $x_2$ of the interval $[a, b]$. Then the points $\varphi(x_1)$ and $\varphi(x_2)$ will
also belong to the interval $[a, b]$. Using the Lagrange formula we
obtain

$$\varphi(x_2) - \varphi(x_1) = \varphi'(c)(x_2 - x_1)$$

where $c$ is some point lying between $x_1$ and $x_2$ and so belonging to the
interval $[a, b]$. The inequality $|\varphi'(c)| \le q < 1$ holds and therefore

$$|\varphi(x_2) - \varphi(x_1)| \le q\,|x_2 - x_1| \qquad (84)$$

Inequality (84) shows that $\varphi(x)$ is a contraction mapping. But we
know that if $x \to \varphi(x)$ is a contraction mapping of the interval $[a, b]$
into itself, then for any point $x_0$ of this interval the sequence
$x_0, x_1, \ldots, x_n, \ldots$, where $x_{n+1} = \varphi(x_n)$, converges to the root of the equation
$x = \varphi(x)$. Thus, we have proved the following theorem.

Theorem. Let the function $y = \varphi(x)$ be a mapping of the interval
$[a, b]$ into itself and suppose in this interval the inequality $|\varphi'(x)| \le q$,
where $q < 1$, holds. Then for any point $x_0$ of the interval $[a, b]$ the
sequence of points $x_0, x_1, \ldots, x_n, \ldots$, where $x_{n+1} = \varphi(x_n)$, converges to
the root of the equation $x = \varphi(x)$.

Roughly speaking, the theorem just proved says that the process of
successive approximations enables us to find those roots $\xi$ of the equation
$x = \varphi(x)$ for which the inequality $|\varphi'(\xi)| < 1$ is satisfied. One
could say that these points attract the broken line (open polygon)
which geometrically depicts the iteration process (see Sec. 8), while the
points for which $|\varphi'(\xi)| > 1$ repel this line.

If the inequality $|\varphi'(x)| \le q < 1$ is satisfied on the entire number
axis, the iteration process converges independently of the choice of the
initial approximation $x_0$ (see Sec. 10).

Example 1. Can the iteration process be applied to the solution of
the equation

$$x = \frac{\cos x + \sin x}{4}$$

In this case

$$\varphi(x) = \frac{\cos x + \sin x}{4}$$

Therefore

$$\varphi'(x) = \frac{-\sin x + \cos x}{4}$$

But $|\sin x| \le 1$, $|\cos x| \le 1$, therefore

$$|\varphi'(x)| \le \frac{|\sin x| + |\cos x|}{4} \le \frac{1}{2} < 1$$

and the iteration process can be applied.
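
Example 1 can be checked numerically. The Python sketch below is an added illustration under stated assumptions: the function names `phi`, `dphi` and `iterate`, the stopping tolerance and the sampling grid are all hypothetical choices, not taken from the original text. It estimates the largest value of $|\varphi'(x)|$ on a grid and then runs the iteration from two different starting points.

```python
import math

def phi(x):
    # Right-hand side of the equation x = (cos x + sin x) / 4.
    return (math.cos(x) + math.sin(x)) / 4.0

def dphi(x):
    # Its derivative, (-sin x + cos x) / 4.
    return (-math.sin(x) + math.cos(x)) / 4.0

def iterate(func, x0, tol=1e-10, max_steps=200):
    """Repeat x -> func(x) until two successive values differ by less than tol."""
    x = x0
    for n in range(1, max_steps + 1):
        x_new = func(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    raise RuntimeError("no convergence within max_steps")

# Estimate of max |phi'(x)| on the grid -10, -9.99, ..., 10.
q_estimate = max(abs(dphi(k * 0.01)) for k in range(-1000, 1001))

root_from_zero, steps_from_zero = iterate(phi, 0.0)
root_from_ten, steps_from_ten = iterate(phi, 10.0)
print(q_estimate, root_from_zero, root_from_ten)
```

Both starting points lead to the same root, in agreement with the remark that the choice of $x_0$ does not matter when the inequality holds on the entire number axis.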

Example 2. Can the iteration process be applied to solve the
equation

$$x = 4 - 2^x \qquad (85)$$

The root to be found lies in the interval [1, 2] since the continuous
function $y = x - 4 + 2^x$ changes sign in this interval:

$$1 - 4 + 2^1 < 0 \quad \text{while} \quad 2 - 4 + 2^2 > 0$$

But in this case we have

$$\varphi'(x) = -2^x \ln 2$$

Let us evaluate the expression $2^x \ln 2$ in the interval [1, 2]. If
$1 \le x \le 2$, then $2 \le 2^x \le 4$ and therefore

$$2 \ln 2 \le 2^x \ln 2 \le 4 \ln 2$$

From the tables of Napierian logarithms (their base is the number
$e \approx 2.718\ldots$) we find $\ln 2 = 0.69\ldots$. Consequently on the interval [1, 2]
the inequality

$$1.38\ldots \le 2^x \ln 2 \le 2.76\ldots$$

holds and so the iteration process is inapplicable.

In order to be able to apply the iteration process we can transform
equation (85). Rewrite it in the form

$$2^x = 4 - x$$

and take the logarithms to the base 2 of both sides. Then we shall
obtain

$$x = \log_2 (4 - x)$$

In this case

$$\varphi'(x) = -\frac{1}{(4 - x)\ln 2}$$

and on the interval [1, 2] the inequality

$$|\varphi'(x)| \le \frac{1}{2 \ln 2} = \frac{1}{1.38\ldots} < 1$$

holds. (The reader may easily derive the inequality himself.)

Therefore when the equation is written in this way the iteration process
converges.
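
The difference between the two ways of writing equation (85) is easy to observe numerically. The following Python sketch is an added illustration; the names `run`, `phi_bad` and `phi_good`, the starting value 1.5 and the number of steps are assumptions made here.

```python
import math

def run(func, x0, steps):
    """Return the list x0, func(x0), func(func(x0)), ... with steps iterations."""
    xs = [x0]
    for _ in range(steps):
        xs.append(func(xs[-1]))
    return xs

phi_bad = lambda x: 4.0 - 2.0 ** x       # equation (85) as given
phi_good = lambda x: math.log2(4.0 - x)  # the transformed equation

bad = run(phi_bad, 1.5, 60)
good = run(phi_good, 1.5, 60)

print("original form, last values:   ", bad[-3:])
print("transformed form, last values:", good[-3:])
print("residual 2^x + x - 4:", 2.0 ** good[-1] + good[-1] - 4.0)
```

With the original form the values leave the interval [1, 2] and keep jumping between widely separated numbers, while the transformed form settles on the root of equation (85).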


22. Rate of Convergence of Iteration Process* 

Let us now use the derivative of the function $\varphi(x)$ to find the rate of
convergence of the iteration process for the solution of the equation
$x = \varphi(x)$. We want to find the rate of decrease in the errors $\alpha_n = \xi - x_n$
of the approximate values $x_1, \ldots, x_n, \ldots$ of the root $\xi$.

Note that the equalities $\xi = \varphi(\xi)$ and $x_{n+1} = \varphi(x_n)$ are valid. It follows
from them that

$$\alpha_{n+1} = \xi - x_{n+1} = \varphi(\xi) - \varphi(x_n)$$

But using the Lagrange formula we have

$$\varphi(\xi) - \varphi(x_n) = \varphi'(c_n)(\xi - x_n) = \varphi'(c_n)\,\alpha_n$$

where $c_n$ is a point lying between the points $x_n$ and $\xi$. Therefore

$$\alpha_{n+1} = \varphi'(c_n)\,\alpha_n \qquad (86)$$

The following conclusion may be drawn from equality (86):

Let $\xi$ be the root of the equation $x = \varphi(x)$ lying in the interval $[a, b]$. If
in this interval the inequality $|\varphi'(x)| \le q < 1$ is satisfied, and the initial
approximation $x_1$ is also chosen in the interval $[a, b]$, then for any $n$ the
relation

$$|\alpha_{n+1}| \le q^n |\alpha_1| \qquad (87)$$

holds.

*The section may be omitted on first reading.


[Fig. 20: the points $\xi$, $c_1$ and $x_1$ on the number axis, with $c_1$ lying between $\xi$ and $x_1$]

Indeed, it follows from equality (86) that

$$|\alpha_2| = |\varphi'(c_1)|\,|\alpha_1|$$

But the point $c_1$ lies in the interval $[a, b]$ (Fig. 20) and therefore

$$|\varphi'(c_1)| \le q$$

It follows from this that

$$|\alpha_2| \le q\,|\alpha_1|$$

In the same way we obtain

$$|\alpha_3| = |\varphi'(c_2)|\,|\alpha_2| \le q\,|\alpha_2| \le q^2 |\alpha_1|$$

and in general

$$|\alpha_{n+1}| \le q^n |\alpha_1|$$

This proves our statement.

Since for $0 < q < 1$ the sequence of numbers $q, q^2, \ldots, q^n, \ldots$ tends to
zero, the error $\alpha_{n+1}$ also tends to zero as $n$ increases. In other words,
with the above assumptions the numbers $x_1, \ldots, x_n, \ldots$ approach the
number $\xi$, with the difference $|\xi - x_{n+1}|$ decreasing at least as fast
as $|\alpha_1| q^n$.
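
Estimate (87) can be observed on the equation $x = (\cos x + \sin x)/4$, for which the bound $q = 1/2$ on $|\varphi'(x)|$ holds. The Python sketch below is an added illustration: the names are hypothetical, the starting value $x_1 = 2$ is an arbitrary choice, and the root $\xi$ is itself only approximated by running the iteration for many steps.

```python
import math

def phi(x):
    return (math.cos(x) + math.sin(x)) / 4.0

q = 0.5  # bound for |phi'(x)|

# Approximate the root xi by a long run of the iteration.
xi = 0.0
for _ in range(200):
    xi = phi(xi)

# errors[n] holds |alpha_{n+1}| = |xi - x_{n+1}|, starting from x_1 = 2.
x = 2.0
errors = []
for n in range(12):
    errors.append(abs(xi - x))
    x = phi(x)

# bounds[n] holds q^n * |alpha_1|, the right-hand side of (87).
bounds = [q ** n * errors[0] for n in range(12)]

for n, (err, bound) in enumerate(zip(errors, bounds)):
    print(n + 1, err, bound)
```

In every row the actual error stays below the bound, usually by a wide margin, since $q$ is only an upper estimate of $|\varphi'(x)|$.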

In the same way it can be proved that if in the interval $[a, b]$ the
inequality

$$|\varphi'(x)| > 1$$

holds, the iteration process diverges.

The convergence rate of the iteration process is highest if the derivative
$\varphi'(x)$ of the function $\varphi(x)$ vanishes at the point $\xi$. In this case, as we get
close to $\xi$, $\varphi'(x)$ tends to zero. Since

$$|\alpha_{n+1}| = |\varphi'(c_n)|\,|\alpha_n|$$

the convergence rate increases as the point $\xi$ is approached.

We have already met with a similar situation when using the
method of iteration to extract square roots. Remember that we then
substituted the equation $x = \frac{x^2 + a}{2x}$ for the equation $x^2 = a$. But the
derivative of the function $\varphi(x) = \frac{x^2 + a}{2x}$ is

$$\varphi'(x) = \frac{(x^2 + a)' \cdot 2x - (x^2 + a)(2x)'}{4x^2} = \frac{2x \cdot 2x - (x^2 + a) \cdot 2}{4x^2} = \frac{x^2 - a}{2x^2}$$

[see rule 4 of Sec. 18 and formula (55) of Sec. 13], therefore

$$\varphi'(\sqrt{a}) = \frac{(\sqrt{a})^2 - a}{2(\sqrt{a})^2} = 0$$

Hence, the derivative of the function $\varphi(x)$ vanishes at the point
$x = \sqrt{a}$ and this accelerates the convergence of the process as the point
$x = \sqrt{a}$ is approached.
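
The acceleration is clearly visible if the errors of the square-root iteration are printed step by step. The Python sketch below is an added illustration; the function name `sqrt_iteration`, the value $a = 2$ and the starting point $x = 1$ are assumptions chosen for the demonstration.

```python
import math

def sqrt_iteration(a, x0, steps):
    """Iterate x -> (x^2 + a) / (2x), the equation used for extracting square roots."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append((x * x + a) / (2.0 * x))
    return xs

xs = sqrt_iteration(2.0, 1.0, 6)
errors = [abs(x - math.sqrt(2.0)) for x in xs]

for n, (x, err) in enumerate(zip(xs, errors)):
    print(n, x, err)
```

Each error is roughly proportional to the square of the previous one, so the number of correct digits about doubles at every step once the approximation is close to the root.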

This increase in the rate of the process as the root of the equation
is approached is also a feature of Newton's method (a specific case of
which is the method of extracting square roots mentioned above). Indeed,
we have already noted that Newton's method involves replacing
the equation $f(x) = 0$ by the equation

$$x = x - \frac{f(x)}{f'(x)}$$

and the subsequent solution of this equation by successive approximations.
In this case we have

$$\varphi(x) = x - \frac{f(x)}{f'(x)}$$

but

$$\varphi'(x) = 1 - \left[\frac{f(x)}{f'(x)}\right]' = 1 - \frac{f'(x)[f(x)]' - f(x)[f'(x)]'}{[f'(x)]^2} = \frac{f(x)\,f''(x)}{[f'(x)]^2}$$

Since at the point $\xi$ the equation $f(\xi) = 0$ is satisfied, $\varphi'(\xi) = 0$ and this, as
was shown above, accelerates the convergence of the approximation
process as the point $\xi$ is approached.


23. Solving of Systems of Linear Equations by Method 
of Successive Approximations 


Up to now we have been solving equations in one unknown. Now 
we shall consider the solution of systems of equations starting with 
systems of equations of the first degree. 

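Suppose we have $m$ first-degree equations with $m$ unknowns*:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1m}x_m &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2m}x_m &= b_2 \\
\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots & \\
a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mm}x_m &= b_m
\end{aligned} \qquad (88)$$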


There are numerous applications for such systems. For instance, when
geodesists combine the results of measurements made over large areas
of the Earth's surface, they sometimes have to solve systems of many
hundreds of equations. Engineers designing rigid structures and specialists
in many other fields also have to solve such systems of
equations.

The solution of these systems by the usual methods (for instance, by
the method of eliminating unknowns) is often very troublesome. The
method of successive approximations turns out to be more convenient.
To begin with here is an example of the way this can be done.

Suppose we have a system of equations

$$\begin{aligned}
10x_1 - 2x_2 + x_3 &= 9 \\
x_1 + 5x_2 - x_3 &= 8 \\
4x_1 + 2x_2 + 8x_3 &= 32
\end{aligned}$$

*In these equations we denote the unknowns by the letters $x_1, x_2, \ldots, x_m$
and the coefficients by the letters $a_{ij}$. Here the first subscript denotes the
equation number and the second, the number of the unknown. For instance, $a_{47}$ is
the coefficient in the fourth equation in front of the seventh unknown.

We are required to find the unknowns $x_1, x_2, x_3$ with an accuracy of
0.01.

Let us solve the first equation for $x_1$, the second for $x_2$,
and the third for $x_3$. Then the system will assume the
form

$$\begin{aligned}
x_1 &= 0.9 + 0.2x_2 - 0.1x_3 \\
x_2 &= 1.6 - 0.2x_1 + 0.2x_3 \\
x_3 &= 4 - 0.5x_1 - 0.25x_2
\end{aligned} \qquad (89)$$

Take arbitrary values of $x_1, x_2, x_3$ as the initial approximations, for
instance, put $x_1^{(0)} = 0$, $x_2^{(0)} = 0$, $x_3^{(0)} = 0$. Substitute these values into
the right-hand sides of equations (89) and take the values obtained to
be the next approximate values of $x_1, x_2, x_3$. We obtain

$$x_1^{(1)} = 0.9, \qquad x_2^{(1)} = 1.6, \qquad x_3^{(1)} = 4$$

Substitute the values thus obtained again into the right-hand sides of
the equations (89). We obtain the approximations

$$\begin{aligned}
x_1^{(2)} &= 0.9 + 0.2 \times 1.6 - 0.1 \times 4 = 0.82 \\
x_2^{(2)} &= 1.6 - 0.2 \times 0.9 + 0.2 \times 4 = 2.22 \\
x_3^{(2)} &= 4 - 0.5 \times 0.9 - 0.25 \times 1.6 = 3.15
\end{aligned}$$

In general, if the values $x_1^{(n)}, x_2^{(n)}, x_3^{(n)}$ have been found, then to obtain
the next approximations one has to use the formulas

$$\begin{aligned}
x_1^{(n+1)} &= 0.9 + 0.2x_2^{(n)} - 0.1x_3^{(n)} \\
x_2^{(n+1)} &= 1.6 - 0.2x_1^{(n)} + 0.2x_3^{(n)} \\
x_3^{(n+1)} &= 4 - 0.5x_1^{(n)} - 0.25x_2^{(n)}
\end{aligned} \qquad (90)$$

The results of the computations are presented in Table 2.

Table 2

n              1      2      3      4      5      6
$x_1^{(n)}$    0.9    0.82   1.03   1.01   1.00   1.00
$x_2^{(n)}$    1.6    2.22   2.07   2.00   1.99   2.00
$x_3^{(n)}$    4.0    3.15   3.03   2.97   3.00   3.00


We see that, with the accuracy required, the following equalities
hold:

$$x_1^{(5)} = x_1^{(6)}, \qquad x_2^{(5)} = x_2^{(6)}, \qquad x_3^{(5)} = x_3^{(6)} \qquad (91)$$

Putting $n = 5$ in equations (90) and taking account of the equalities
(91) we obtain that within the accuracy required

$$\begin{aligned}
x_1^{(5)} &\approx 0.9 + 0.2x_2^{(5)} - 0.1x_3^{(5)} \\
x_2^{(5)} &\approx 1.6 - 0.2x_1^{(5)} + 0.2x_3^{(5)} \\
x_3^{(5)} &\approx 4 - 0.5x_1^{(5)} - 0.25x_2^{(5)}
\end{aligned}$$

(in fact, the equations are satisfied exactly but this is unimportant). It
follows that the numbers $x_1^{(5)} = 1.00$, $x_2^{(5)} = 2.00$, $x_3^{(5)} = 3.00$ are (within
the accuracy required) the solutions of the given system of equations.
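
The computations of Table 2 are easy to reproduce by machine. The Python sketch below is an added illustration: it applies formulas (90) repeatedly, and the names `next_approximation` and `solve`, as well as the stopping rule (stop when no unknown changes by 0.01 or more), are assumptions made here rather than part of the original text.

```python
def next_approximation(x1, x2, x3):
    """One step of formulas (90): every right-hand side uses the old values."""
    return (0.9 + 0.2 * x2 - 0.1 * x3,
            1.6 - 0.2 * x1 + 0.2 * x3,
            4.0 - 0.5 * x1 - 0.25 * x2)

def solve(tol=0.01, max_steps=100):
    x = (0.0, 0.0, 0.0)   # initial approximations
    table = []
    for n in range(1, max_steps + 1):
        x_new = next_approximation(*x)
        table.append((n,) + x_new)
        if max(abs(u - v) for u, v in zip(x_new, x)) < tol:
            return x_new, table
        x = x_new
    raise RuntimeError("no convergence within max_steps")

solution, table = solve()
for row in table:
    print(row[0], [round(v, 2) for v in row[1:]])
```

Since the program does not round intermediate values, individual entries may differ from Table 2 in the last digit.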

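The same method is used in the general case*. Suppose we have
a system of equations (88). Solve the first equation for $x_1$, the second for $x_2$,
etc. Then system (88) assumes the form

$$\begin{aligned}
x_1 &= \frac{b_1}{a_{11}} - \frac{a_{12}}{a_{11}}x_2 - \ldots - \frac{a_{1m}}{a_{11}}x_m \\
x_2 &= \frac{b_2}{a_{22}} - \frac{a_{21}}{a_{22}}x_1 - \ldots - \frac{a_{2m}}{a_{22}}x_m \\
\ldots\ldots & \\
x_m &= \frac{b_m}{a_{mm}} - \frac{a_{m1}}{a_{mm}}x_1 - \ldots - \frac{a_{m,m-1}}{a_{mm}}x_{m-1}
\end{aligned} \qquad (92)$$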


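*The text below down to the end of the section may be omitted on first
reading.

Let $x_1^{(1)}, \ldots, x_m^{(1)}$ be initial approximations for the unknowns $x_1, \ldots, x_m$.
Substituting these values into the right-hand sides of equations (92)
we obtain second approximations $x_1^{(2)}, \ldots, x_m^{(2)}$ for the unknowns
sought:

$$\begin{aligned}
x_1^{(2)} &= \frac{b_1}{a_{11}} - \frac{a_{12}}{a_{11}}x_2^{(1)} - \ldots - \frac{a_{1m}}{a_{11}}x_m^{(1)} \\
x_2^{(2)} &= \frac{b_2}{a_{22}} - \frac{a_{21}}{a_{22}}x_1^{(1)} - \ldots - \frac{a_{2m}}{a_{22}}x_m^{(1)} \\
\ldots\ldots & \\
x_m^{(2)} &= \frac{b_m}{a_{mm}} - \frac{a_{m1}}{a_{mm}}x_1^{(1)} - \ldots - \frac{a_{m,m-1}}{a_{mm}}x_{m-1}^{(1)}
\end{aligned}$$

In the same way, if the $n$-th approximations $x_1^{(n)}, x_2^{(n)}, \ldots, x_m^{(n)}$ have
already been found, the formulas for the next step are

$$\begin{aligned}
x_1^{(n+1)} &= \frac{b_1}{a_{11}} - \frac{a_{12}}{a_{11}}x_2^{(n)} - \ldots - \frac{a_{1m}}{a_{11}}x_m^{(n)} \\
x_2^{(n+1)} &= \frac{b_2}{a_{22}} - \frac{a_{21}}{a_{22}}x_1^{(n)} - \ldots - \frac{a_{2m}}{a_{22}}x_m^{(n)} \\
\ldots\ldots & \\
x_m^{(n+1)} &= \frac{b_m}{a_{mm}} - \frac{a_{m1}}{a_{mm}}x_1^{(n)} - \ldots - \frac{a_{m,m-1}}{a_{mm}}x_{m-1}^{(n)}
\end{aligned} \qquad (93)$$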


Consider now a problem which deals not with the solution of a system
of linear equations by the method of successive approximations,
but with finding the limiting state of some process of approximations
by means of solving a system of equations.

There are three buckets. The first contains 12 litres of water, the
second and the third are empty. Half of the water in the first bucket is
poured into the second, then half of the water in the second bucket is
poured into the third, then half of the water contained in the third bucket
is poured into the first. This cycle is repeated 20 times. It is required to
find (with an accuracy of 0.0001 l) how much water there will be in each
bucket.

Evidently, this problem deals with successive approximations to
some limiting distribution of water. This limiting distribution is characterized
by the fact that it does not change as the result of one of the
pouring cycles described. If at the start of a cycle there are $x$ litres of
water in the first bucket, $y$ in the second and $12 - x - y$ litres in the
third (the total amount of water does not change as the result of the
pouring) then the above cycle will be described by the following table:




                      1st bucket          2nd bucket       3rd bucket
Initial state         x                   y                12 - x - y
After 1st pouring     x/2                 x/2 + y          12 - x - y
After 2nd pouring     x/2                 x/4 + y/2        12 - 3x/4 - y/2
After 3rd pouring     6 + x/8 - y/4       x/4 + y/2        6 - 3x/8 - y/4


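For the amount of water in each bucket to remain unchanged as the
result of such a pouring cycle, the following equations should be
satisfied:

$$6 + \frac{x}{8} - \frac{y}{4} = x, \qquad \frac{x}{4} + \frac{y}{2} = y$$

Solving these equations we find that $x = 6$, $y = 3$. Hence, the limiting
distribution is such that the first bucket contains 6 l of water, the
second 3 l and, consequently, the third, too, contains 3 l of water.
Let us see what the rate is at which a specific distribution of water
approaches the limiting values. Suppose the first bucket contains
$a$ litres and the second $b$ litres. After one cycle there will be

$$a_1 = 6 + \frac{a}{8} - \frac{b}{4} \qquad (94)$$

litres in the first bucket and

$$b_1 = \frac{a}{4} + \frac{b}{2} \qquad (95)$$

in the second.

Denote $a - 6$ by $\alpha$, $b - 3$ by $\beta$, $a_1 - 6$ by $\alpha_1$ and $b_1 - 3$ by $\beta_1$. Then it
follows from equations (94) and (95) that

$$\alpha_1 = a_1 - 6 = \frac{a - 6}{8} - \frac{b - 3}{4} = \frac{\alpha}{8} - \frac{\beta}{4}$$

and

$$\beta_1 = b_1 - 3 = \frac{a - 6}{4} + \frac{b - 3}{2} = \frac{\alpha}{4} + \frac{\beta}{2}$$

After the second cycle the errors will be expressed by the formulas

$$\alpha_2 = \frac{\alpha_1}{8} - \frac{\beta_1}{4} = \frac{1}{8}\left(\frac{\alpha}{8} - \frac{\beta}{4}\right) - \frac{1}{4}\left(\frac{\alpha}{4} + \frac{\beta}{2}\right) = -\frac{3\alpha}{64} - \frac{5\beta}{32}$$

$$\beta_2 = \frac{\alpha_1}{4} + \frac{\beta_1}{2} = \frac{1}{4}\left(\frac{\alpha}{8} - \frac{\beta}{4}\right) + \frac{1}{2}\left(\frac{\alpha}{4} + \frac{\beta}{2}\right) = \frac{5\alpha}{32} + \frac{3\beta}{16}$$

So if $|\alpha| < \varepsilon$ and $|\beta| < \varepsilon$, then

$$|\alpha_2| < \frac{13}{64}\,\varepsilon \approx 0.2\varepsilon$$

$$|\beta_2| < \frac{11}{32}\,\varepsilon \approx 0.34\varepsilon$$

This means that two pouring cycles decrease the errors $\alpha$ and $\beta$ roughly
three times. Therefore after 20 cycles they will decrease roughly $3^{10} \approx 60000$
times. So with an accuracy of 0.0001 l after 20 pouring cycles
there will be 6 l of water in the first bucket, 3 l in the second and 3 l in
the third.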
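
The pouring process itself can be simulated directly, which gives an independent check of the limiting distribution. The Python sketch below is an added illustration; the function name `pour_cycle` is hypothetical.

```python
def pour_cycle(first, second, third):
    """One cycle: half of bucket 1 into 2, half of 2 into 3, half of 3 into 1."""
    half = first / 2.0
    first, second = first - half, second + half
    half = second / 2.0
    second, third = second - half, third + half
    half = third / 2.0
    third, first = third - half, first + half
    return first, second, third

state = (12.0, 0.0, 0.0)
for cycle in range(1, 21):
    state = pour_cycle(*state)
    if cycle in (1, 2, 5, 10, 20):
        print(cycle, state)
```

After the first cycle the buckets hold 7.5, 3 and 1.5 litres, in agreement with formulas (94) and (95) for $a = 12$, $b = 0$, and after 20 cycles the amounts differ from 6, 3 and 3 litres by far less than 0.0001 l.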


24. Solving Systems of Non-linear Equations Using Method 
of Successive Approximations 

The method of successive approximations (iteration) may be used 
to solve some systems of non-linear equations as well. Consider, for 
instance, the system 


$$\begin{aligned}
x &= 2 + \frac{x^2 + y}{20} \\
y &= 1 + \frac{x + y^2}{20}
\end{aligned} \qquad (96)$$


Choose as the first approximations $x_0 = 0$ and $y_0 = 0$. Substituting
these approximations for $x$ and $y$ into the right-hand sides of the equations
we obtain the following approximations: $x_1 = 2$, $y_1 = 1$. Substituting
these approximations into the right-hand sides of equations (96)
we obtain

$$x_2 = 2 + \frac{2^2 + 1}{20} = 2.25, \qquad y_2 = 1 + \frac{2 + 1^2}{20} = 1.15$$

Continuing the process we obtain

$$x_3 = 2 + \frac{2.25^2 + 1.15}{20} = 2.31, \qquad y_3 = 1 + \frac{2.25 + 1.15^2}{20} = 1.18$$

$$x_4 = 2 + \frac{2.31^2 + 1.18}{20} = 2.33, \qquad y_4 = 1 + \frac{2.31 + 1.18^2}{20} = 1.18$$

$$x_5 = 2 + \frac{2.33^2 + 1.18}{20} = 2.33, \qquad y_5 = 1 + \frac{2.33 + 1.18^2}{20} = 1.18$$

We see that with an accuracy of 0.01 the equalities $x_4 = x_5 = 2.33$ and
$y_4 = y_5 = 1.18$ are satisfied. Therefore we have with the accuracy
stated above $x = 2.33$ and $y = 1.18$.
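
The same iteration takes only a few lines of code. The Python sketch below is an added illustration for system (96); the names `step` and `solve_system` and the tolerance are assumptions made here.

```python
def step(x, y):
    """Right-hand sides of system (96), both evaluated at the old point (x, y)."""
    return 2.0 + (x * x + y) / 20.0, 1.0 + (x + y * y) / 20.0

def solve_system(x=0.0, y=0.0, tol=1e-10, max_steps=200):
    for n in range(1, max_steps + 1):
        x_new, y_new = step(x, y)
        if abs(x_new - x) < tol and abs(y_new - y) < tol:
            return x_new, y_new, n
        x, y = x_new, y_new
    raise RuntimeError("no convergence within max_steps")

x_root, y_root, n_steps = solve_system()
print(x_root, y_root, n_steps)
```

Carried to more digits, the result agrees with the values 2.33 and 1.18 found above to within 0.01.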

In general, if we have a system of equations

$$\begin{aligned}
x &= \varphi(x, y) \\
y &= \psi(x, y)
\end{aligned} \qquad (97)$$

where $\varphi(x, y)$ and $\psi(x, y)$ are certain functions, we choose initial
approximations $x_0$ and $y_0$, substitute them into the right-hand sides of
equations (97) and then proceed with the calculations according to the
formulas

$$\begin{aligned}
x_{n+1} &= \varphi(x_n, y_n) \\
y_{n+1} &= \psi(x_n, y_n)
\end{aligned} \qquad (98)$$

If for some number $n$ the equalities $x_{n+1} \approx x_n$ and $y_{n+1} \approx y_n$ are satisfied
with specified accuracy, then we have with that accuracy $x \approx x_n$,
$y \approx y_n$.

Systems of equations containing three or more unknowns are 
solved in the same way. 

Let us now establish the conditions guaranteeing the convergence 
of the process of successive approximations in the solution of systems. 

We assume the functions $\varphi(x, y)$ and $\psi(x, y)$ in the system of equations
(97) to be defined for some bounded closed region $D$ of the $x$, $y$
plane. In other words, we will assume that the region $D$ lies entirely
inside some square, and that it contains all its boundary points. A circle,
a polygon (together with the broken line bounding it), an ellipse,
etc. may serve as examples of such regions. We shall, in addition,
assume the functions $\varphi(x, y)$ and $\psi(x, y)$ to be continuous in the
region $D$. The functions $\varphi(x, y)$ and $\psi(x, y)$ define a mapping of the
region $D$ into some other region lying in the same plane. To find the
point into which the point $M_0(x_0, y_0)$ of region $D$ is mapped, one must
substitute its coordinates for $x$ and $y$ in $\varphi(x, y)$ and $\psi(x, y)$. The result
of such a substitution will be the coordinates of the image of the point
$M_0$. For instance, if

$$\begin{aligned}
\varphi(x, y) &= x^2 + y^2 \\
\psi(x, y) &= 2xy
\end{aligned}$$

then the point $M_0(1, 3)$ will be mapped to the point $N_0(10, 6)$.

In future we shall denote the mapping given by the functions $\varphi(x, y)$
and $\psi(x, y)$ by one letter $\Phi$, and denote the image of the point $M$ under
the mapping by $\Phi(M)$. The image of the region $D$ as a whole will
be denoted by $\Phi(D)$.

Suppose the mapping $\Phi$ maps region $D$ into subregion $D_1 = \Phi(D)$.
Then the same mapping may be applied to map the region $D_1$ into
subregion $D_2 = \Phi(D_1)$ which, of course, also lies inside region $D$.
Continuing this process we obtain a system of regions $D, D_1, \ldots, D_n, \ldots$
(Fig. 21) lying one inside the other.

We shall call the mapping $\Phi$ a contraction if there is a number $q$,
$0 < q < 1$, such that for any two points $M_1$ and $M_2$ inside $D$ the
inequality

$$r(\Phi(M_1), \Phi(M_2)) \le q\,r(M_1, M_2)$$

holds. Here $r(M, N)$ denotes the distance between the points $M$ and $N$.

As in the case of a single variable the following proposition can be
proved:

Suppose the mapping $\Phi$ maps the region $D$ into a subregion, and that it
is a contraction. Then there will be a unique point $N$ in the region $D$ such
that $N = \Phi(N)$. This point belongs to all the regions $D_n$. The coordinates
$\xi$, $\eta$ of the point $N$ satisfy the system of equations (97), i.e.

$$\xi = \varphi(\xi, \eta), \qquad \eta = \psi(\xi, \eta)$$

As in the case of a single variable, the values $\xi$ and $\eta$ are computed
approximately by the method of iterations. If $M_0(x_0, y_0)$ is an
arbitrary point of region $D$ and $M_{n+1} = \Phi(M_n)$ [i.e. $x_{n+1} = \varphi(x_n, y_n)$,
$y_{n+1} = \psi(x_n, y_n)$], the sequence of points $M_0, M_1, \ldots, M_n, \ldots$ converges
to the fixed point $N$ of the mapping.


25. Modified Distance 

The requirement that $\Phi$ be a contraction mapping is a sufficient
condition for the convergence of the iteration process. But this condition
is not necessary. The mapping $\Phi$ may not be a contraction, but
the iteration process may nevertheless converge. For instance, the
mapping $\Phi$ defined by the functions $\varphi(x, y) = 1 + 2y$, $\psi(x, y) = 3 + \frac{x}{8}$
is not a contraction. If we take $A(8, 0)$ and $B(8, 4)$ we obtain

$$r(A, B) = 4, \qquad \Phi(A) = (1, 4), \qquad \Phi(B) = (9, 4)$$

and

$$r(\Phi(A), \Phi(B)) = 8 > r(A, B)$$

However, no matter what point $M_0$ we take, the sequence of points $M_0,
M_1, \ldots, M_n, \ldots$ converges to the point

$$N\left(\frac{28}{3}, \frac{25}{6}\right)$$
In some cases the convergence of the iteration process can be established
if the definition of the distance between the points in a plane is
modified. Actually, there may be different definitions of the distance
between two points. For a traveller, it would be natural to measure the
distance by the time it takes him to get from point $A$ to point $B$. In the
case depicted in Fig. 22a the distance between points $A$ and $B$ will be
equal to the sum of the lengths of the segments $AC$, $CD$ and $DB$ (to get
from point $A$ to point $B$ one has to reach the bridge $CD$, cross it and
then walk from point $D$ to point $B$). If the motion in the plane can take
place only along two mutually perpendicular directions, as in Fig. 22b,
the distance between points $A$ and $B$ should be defined as the sum of
the segments $AC$ and $CB$. In other cases it is more convenient to
define the "distance" between points $A$ and $B$ as the length
of the greater of the two segments $AC$ and $CB$. Other definitions
of the "distance" between the points of a plane can also
be devised. (More detailed information about different definitions
of distance can be obtained from the book by Yu. A. Schreider, "What
is Distance?".)

The distance between points $r(A, B)$ is usually required to have the
following properties:

(1) The distance $r(A, B)$ between any two points $A$ and $B$ is nonnegative,
being zero only when the points $A$ and $B$ coincide.

(2) For any two points $A$ and $B$ the symmetry condition is valid:

$$r(A, B) = r(B, A)$$

(3) For any three points $A$, $B$, $C$ the triangle inequality holds:

$$r(A, B) \le r(A, C) + r(C, B)$$

If for some set of objects a distance is defined which has these properties,
then the set is called a metric space, and the elements of the set
are called points of this space. The points of a metric space may even
be functions. For continuous functions $\varphi(x)$ and $\psi(x)$ on the interval
$[a, b]$ the distance may be defined as the maximum value of the function
$|\varphi(x) - \psi(x)|$ on this interval:

$$r(\varphi, \psi) = \max_{a \le x \le b} |\varphi(x) - \psi(x)|$$

We have already seen that there are different ways of turning
a plane into a metric space: in all of the above methods of defining distance
the conditions (1)-(3) are satisfied.

Here is an interesting example of a case in which the conditions (1) and (3)
are satisfied, but the condition of symmetry (2) is not. Let us measure the distance
between the points $A$ and $B$ of a mountainous terrain by the time it takes
to get from $A$ to $B$. Since the time of ascent is not the same as the time of descent,
$r(A, B) \ne r(B, A)$.


It turns out that the following condition is sufficient for the process
of successive approximations to the solution of a system of equations

$$\begin{aligned}
x &= \varphi(x, y) \\
y &= \psi(x, y)
\end{aligned} \qquad (99)$$

to converge:

The mapping $\Phi$ defined by the functions $\varphi(x, y)$ and $\psi(x, y)$ maps
the region $D$ into itself and is a contraction for at least one "distance"
$r(A, B)$. In other words, there should be a number $q$, where $0 < q < 1$,
such that for any two points $M_1$ and $M_2$ of $D$ the inequality

$$r(\Phi(M_1), \Phi(M_2)) \le q\,r(M_1, M_2)$$

holds.

Take, for instance, the functions $\varphi(x, y) = 1 + 2y$ and $\psi(x, y) = 3 + \frac{x}{8}$.
The mapping $\Phi$ defined by them turns out to be a contraction
with factor 1/2 if the distance between the points $A(x_1, y_1)$ and
$B(x_2, y_2)$ is defined by the formula

$$r(A, B) = \left|\tfrac{1}{2}(x_2 - x_1) + 2(y_2 - y_1)\right| + \left|\tfrac{1}{2}(x_2 - x_1) - 2(y_2 - y_1)\right|$$

For instance, for points $A(8, 0)$ and $B(8, 4)$ we have $r(A, B) = 16$
while for their images $\Phi(A)$ and $\Phi(B)$

$$r(\Phi(A), \Phi(B)) = 8$$

As a result of the mapping the "distance" becomes twice as small.
This is the reason why the process of iterations in the solution of the
system of equations

$$x = 1 + 2y, \qquad y = 3 + \frac{x}{8}$$

converges although the mapping $\Phi$ is not a contraction with respect
to the usual distance (see p. 82).
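
Both distances are easy to compare numerically for the mapping $\varphi(x, y) = 1 + 2y$, $\psi(x, y) = 3 + x/8$. The Python sketch below is an added illustration; the names `Phi`, `r_usual` and `r_modified` are hypothetical, and the fixed point $(28/3, 25/6)$ used in the last line was obtained here by solving the two linear equations by hand.

```python
import math

def Phi(point):
    """The mapping defined by phi(x, y) = 1 + 2y and psi(x, y) = 3 + x/8."""
    x, y = point
    return (1.0 + 2.0 * y, 3.0 + x / 8.0)

def r_usual(A, B):
    """Ordinary (Euclidean) distance in the plane."""
    return math.hypot(B[0] - A[0], B[1] - A[1])

def r_modified(A, B):
    """The modified distance |dx/2 + 2dy| + |dx/2 - 2dy|."""
    dx, dy = B[0] - A[0], B[1] - A[1]
    return abs(0.5 * dx + 2.0 * dy) + abs(0.5 * dx - 2.0 * dy)

A, B = (8.0, 0.0), (8.0, 4.0)
print(r_usual(A, B), r_usual(Phi(A), Phi(B)))        # the usual distance grows
print(r_modified(A, B), r_modified(Phi(A), Phi(B)))  # the modified distance is halved

M = (0.0, 0.0)
for _ in range(100):
    M = Phi(M)
print(M, (28.0 / 3.0, 25.0 / 6.0))
```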

26. Convergence Tests for Process
of Successive Approximations for Systems
of Linear Equations

Let us apply the convergence test devised above to systems of linear 
equations. By choosing different types of “distance” we shall obtain 
the convergence tests for such systems expressed as properties of their 
coefficients. 

Consider, to begin with, a system of two linear equations in two
unknowns

$$\begin{aligned}
a_{11}x + a_{12}y &= b_1 \\
a_{21}x + a_{22}y &= b_2
\end{aligned} \qquad (100)$$

Let $a_{11} \ne 0$ and $a_{22} \ne 0$. Solve the first equation for $x$ and the second
for $y$. We obtain a system of equations

$$\begin{aligned}
x &= -\frac{a_{12}}{a_{11}}\,y + \frac{b_1}{a_{11}} \\
y &= -\frac{a_{21}}{a_{22}}\,x + \frac{b_2}{a_{22}}
\end{aligned} \qquad (100')$$

For conciseness put

$$-\frac{a_{12}}{a_{11}} = \alpha_1, \qquad \frac{b_1}{a_{11}} = \beta_1, \qquad -\frac{a_{21}}{a_{22}} = \alpha_2, \qquad \frac{b_2}{a_{22}} = \beta_2$$

Then the system assumes the form

$$\begin{aligned}
x &= \alpha_1 y + \beta_1 \\
y &= \alpha_2 x + \beta_2
\end{aligned} \qquad (100'')$$

In this case the functions defining the mapping $\Phi$ are of the form

$$\varphi(x, y) = \alpha_1 y + \beta_1, \qquad \psi(x, y) = \alpha_2 x + \beta_2$$

Let us find out what the coefficients $\alpha_1$ and $\alpha_2$ should be for this
mapping to be a contraction.

As is well known, the distance $r(A, B)$ between points $A(x_1, y_1)$ and
$B(x_2, y_2)$ is expressed by the formula

$$r(A, B) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

The mapping $\Phi$ maps the point $A$ to the point $A_1(\alpha_1 y_1 + \beta_1,\ \alpha_2 x_1 + \beta_2)$
and the point $B$ to the point $B_1(\alpha_1 y_2 + \beta_1,\ \alpha_2 x_2 + \beta_2)$. The distance
between these points is expressed by the formula

$$r(A_1, B_1) = \sqrt{(\alpha_1 y_2 - \alpha_1 y_1)^2 + (\alpha_2 x_2 - \alpha_2 x_1)^2} = \sqrt{\alpha_1^2 (y_2 - y_1)^2 + \alpha_2^2 (x_2 - x_1)^2} \qquad (101)$$

Denote the larger of the numbers $|\alpha_1|$ and $|\alpha_2|$ by $q$:

$$q = \max(|\alpha_1|, |\alpha_2|)$$

Then the inequality

$$r(A_1, B_1) \le q\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} = q\,r(A, B)$$

follows from formula (101).

Consequently, if $q < 1$, then the mapping $\Phi$ is a contraction on the
entire plane. In that case, as we know, the process of successive
approximations is convergent.

Thus, we have proved that if

$$\max(|\alpha_1|, |\alpha_2|) < 1 \qquad (102)$$

the process of successive approximations for the solution of the system
of equations (100'') is always convergent.

Let us express the convergence test derived above directly in terms
of the coefficients of system of equations (100). To do this recall that

$$\alpha_1 = -\frac{a_{12}}{a_{11}}, \qquad \alpha_2 = -\frac{a_{21}}{a_{22}}$$

Substituting these expressions into the condition (102) we arrive at the
following conclusion:

For the process of successive approximations to the solution of a system
of linear equations

$$\begin{aligned}
a_{11}x + a_{12}y &= b_1 \\
a_{21}x + a_{22}y &= b_2
\end{aligned}$$

to be convergent it is sufficient that the condition

$$\max\left(\left|\frac{a_{12}}{a_{11}}\right|, \left|\frac{a_{21}}{a_{22}}\right|\right) < 1$$

be satisfied.

This condition means that the diagonal coefficients should be
greater in absolute value than the non-diagonal coefficients in the
respective rows. For this reason, for instance, when solving the system

$$\begin{aligned}
x - 3y &= -11 \\
6x + y &= 10
\end{aligned}$$

the first equation should be solved for $y$ and the second for $x$:

$$y = \frac{11}{3} + \frac{x}{3}, \qquad x = \frac{5}{3} - \frac{y}{6}$$
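
The effect of choosing which equation to solve for which unknown can be seen by running both variants. The Python sketch below is an added illustration for the system $x - 3y = -11$, $6x + y = 10$; the names `iterate_pair`, `good` and `bad` and the number of steps are assumptions made here.

```python
def iterate_pair(update, x=0.0, y=0.0, steps=40):
    """Apply (x, y) -> update(x, y) a fixed number of times, starting from (0, 0)."""
    for _ in range(steps):
        x, y = update(x, y)
    return x, y

# First equation solved for y, second for x (coefficients 1/6 and 1/3).
good = lambda x, y: (5.0 / 3.0 - y / 6.0, 11.0 / 3.0 + x / 3.0)

# First equation solved for x, second for y (coefficients 3 and 6).
bad = lambda x, y: (-11.0 + 3.0 * y, 10.0 - 6.0 * x)

x_good, y_good = iterate_pair(good)
x_bad, y_bad = iterate_pair(bad)
print(x_good, y_good)   # approaches the solution x = 1, y = 4
print(x_bad, y_bad)     # grows without bound
```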

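Sometimes it is useful to carry out a preliminary transformation of
the system of equations, substituting for the unknowns $x$ and $y$ other
unknowns proportional to them. Take, for instance, the system of
equations

$$\begin{aligned}
12x + y &= 14 \\
3x - 2y &= -1
\end{aligned} \qquad (103)$$

For it

$$\max\left(\left|\frac{a_{12}}{a_{11}}\right|, \left|\frac{a_{21}}{a_{22}}\right|\right) = \max\left(\frac{1}{12}, \frac{3}{2}\right) = \frac{3}{2} > 1$$

and because of this the sufficient condition for the convergence of the
approximation process is not satisfied. But if we put $x = \frac{1}{3}z$ we
obtain the system

$$\begin{aligned}
4z + y &= 14 \\
z - 2y &= -1
\end{aligned} \qquad (104)$$

for which

$$\max\left(\left|\frac{a_{12}}{a_{11}}\right|, \left|\frac{a_{21}}{a_{22}}\right|\right) = \max\left(\frac{1}{4}, \frac{1}{2}\right) = \frac{1}{2} < 1$$

This means that system (104) can be solved by the method of successive
approximations.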

Of course, no one would think of solving such simple systems as
(104) by the method of successive approximations. But for systems
with a large number of unknowns this method is sometimes very useful.
Sufficient conditions for convergence when solving the system

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n &= b_2 \\
\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots & \\
a_{n1}x_1 + a_{n2}x_2 + \ldots + a_{nn}x_n &= b_n
\end{aligned} \qquad (105)$$

are established in almost the same way as for a system of equations in
two unknowns. In fact, the following statement holds:

The process of successive approximations for the system (105) converges
if one of the following conditions is satisfied:

$$1)\quad \max_i \sum_{j=1}^{n}{}^{\prime} \left|\frac{a_{ij}}{a_{ii}}\right| < 1 \qquad (106)$$

$$2)\quad \max_j \sum_{i=1}^{n}{}^{\prime} \left|\frac{a_{ij}}{a_{ii}}\right| < 1 \qquad (107)$$

$$3)\quad \max_k \sum_{i,j=1}^{n}{}^{(k)} \left|\frac{a_{ij}}{a_{ii}}\right|^2 < 1 \qquad (108)$$

The prime in the sums (106) and (107) means that the term for
which $i = j$ should be left out. The symbol $(k)$ in the sum (108) means
that the terms for which $i = k$ should be left out as well as the terms for
which $i = j$. Note that condition (3) must be fulfilled if

$$\sum_{i,j=1}^{n}{}^{\prime} \left|\frac{a_{ij}}{a_{ii}}\right|^2 < 1 \qquad (109)$$

where the prime means that terms for which $i = j$ are omitted. In most
books on computing mathematics this condition is presented in the
form (109).

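As we have already stated before, we arrive at conditions (106)-(108)
if we require the mapping

$$\begin{aligned}
x_1' &= -\frac{a_{12}}{a_{11}}x_2 - \ldots - \frac{a_{1n}}{a_{11}}x_n + \frac{b_1}{a_{11}} \\
\ldots\ldots & \\
x_n' &= -\frac{a_{n1}}{a_{nn}}x_1 - \ldots - \frac{a_{n,n-1}}{a_{nn}}x_{n-1} + \frac{b_n}{a_{nn}}
\end{aligned}$$

to be a contraction with respect to some distance. Actually, the condition
(106) corresponds to the distance between the points $A(x_1, \ldots, x_n)$
and $B(y_1, \ldots, y_n)$

$$r(A, B) = \max(|x_1 - y_1|, \ldots, |x_n - y_n|)$$

the condition (107) corresponds to the distance

$$r(A, B) = \sum_{i=1}^{n} |x_i - y_i|$$

and the condition (108) to the distance

$$r(A, B) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$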


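As in the case of two variables it is sometimes useful to substitute for the
unknowns $x_1, \ldots, x_n$ new unknowns proportional to them: $y_1 = p_1 x_1, \ldots, y_n = p_n x_n$,
where $p_1 > 0, \ldots, p_n > 0$.

In this case the conditions (106), (107), (108) take the form

$$(1')\quad \max_i \sum_{j=1}^{n}{}^{\prime} \left|\frac{a_{ij}}{a_{ii}}\right| \frac{p_i}{p_j} < 1 \qquad (106')$$

$$(2')\quad \max_j \sum_{i=1}^{n}{}^{\prime} \left|\frac{a_{ij}}{a_{ii}}\right| \frac{p_i}{p_j} < 1 \qquad (107')$$

$$(3')\quad \max_k \sum_{i,j=1}^{n}{}^{(k)} \left|\frac{a_{ij}}{a_{ii}}\right|^2 \frac{p_i^2}{p_j^2} < 1 \qquad (108')$$

In particular, when $p_i = |a_{ii}|$, these conditions assume the form

$$(1'')\quad \max_i \sum_{j=1}^{n}{}^{\prime} \left|\frac{a_{ij}}{a_{jj}}\right| < 1 \qquad (106'')$$

$$(2'')\quad \max_j \sum_{i=1}^{n}{}^{\prime} \left|\frac{a_{ij}}{a_{jj}}\right| < 1 \qquad (107'')$$

$$(3'')\quad \max_k \sum_{i,j=1}^{n}{}^{(k)} \left|\frac{a_{ij}}{a_{jj}}\right|^2 < 1 \qquad (108'')$$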

As an example, let us consider the system of equations 
x — 0.06y — 0.5 z = — 2.6 

— 0.2x + y —OAz - 3 
-0.1x + 0.5y + z = 3.9 

The i. nd ti ns (1), (2), and also (T), (2 ) are not satisfied for this 
system. Similarly, inequality (109) does not hold in this case — the sum 
of squares of non-diagonal elements is equal to 1.07. But 



= max(0.6 2 + 0 5 2 + 0.2 2 + 0.4 2 ; 


0 6 2 + 0.5 2 + 0.1 2 + 0.5 2 ; 0.2 2 + 0.4 2 + 0.1 2 + 0.5 2 ) = 0.87 < 1 


and therefore the system can be solved by the method of successive 
approximations. 
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Note that all the conditions formulated above are sufficient for the 
process of successive approximations to converge, but they are by no 
means necessary. Choosing different definitions of the distance 
between the points and writing down the contraction condition we 
arrive at new convergence conditions. However, we do not intend to 
go into this problem here. 

The remarks made in Sec. 5 apply also to systems of linear equations. For instance, the result of the approximations does not depend on the initial approximation. So an error made in the course of the computation does not invalidate the subsequent computations, but only retards the progress towards the final result.

Various forms of the method of successive approximations are used to solve systems of linear equations. Thus, in some methods after the approximate value $x_1^{(n+1)}$ is found, it is substituted together with $x_2^{(n)}, x_3^{(n)}, \ldots$ to find $x_2^{(n+1)}$; then $x_1^{(n+1)}, x_2^{(n+1)}, x_3^{(n)}, \ldots$ are substituted to find $x_3^{(n+1)}$, and so on. The description of all the possible methods of approximations used for solving systems of linear equations could well form the subject of another book.
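The variant just described, in which every newly found component is used at once, can be sketched as follows. This is an illustration only, under the assumption that the system is written in the form x = Bx + c. The function name `seidel` and the sample data (the example system above, solved for the diagonal unknowns with coefficient 0.6 of y in the first equation) are not from the original text.

```python
def seidel(B, c, x0, steps):
    """Successive approximations in which new components are used immediately."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        for i in range(n):
            # x[0..i-1] already hold the (n+1)-th values, x[i..] still the n-th
            x[i] = sum(B[i][j] * x[j] for j in range(n)) + c[i]
    return x

B = [[0.0, 0.6, 0.5],
     [0.2, 0.0, 0.4],
     [0.1, -0.5, 0.0]]
c = [-2.6, 3.0, 3.9]

x, y, z = seidel(B, c, [0.0, 0.0, 0.0], 40)
print(x, y, z)
```

The only difference from the simple scheme is that the list `x` is updated in place, so the later components of one sweep already see the earlier ones.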


27. Successive Approximations in Geometry 

We have described the application of the method of successive approximations to the solution of equations and systems of equations. The method is also applied to some problems of geometry, for instance, to the problem of computing the length of the circumference. As is well known, to compute the length of the circumference the practice is to first find the perimeter of the inscribed square, next the perimeter of the inscribed regular octagon, of a regular polygon with 16 sides, and so on. The limit of these perimeters is equal to the length of the circumference. In the process each subsequent perimeter is computed with the aid of the preceding one. This is done in the following way.

Denote the side of a regular inscribed $2^n$-gon by $A_n$ and its perimeter by $P_n$. For instance, $A_2$ is the side of a square and therefore $A_2 = R\sqrt{2}$, $P_2 = 4R\sqrt{2}$. Suppose we have already found $P_n$. Then obviously

$A_n = \frac{P_n}{2^n}$

In geometry it is proved that the side $a_{2n}$ of a regular inscribed $2n$-gon is expressed through the side $a_n$ of a regular inscribed $n$-gon and the radius $R$ of the circumference by the formula*

$a_{2n} = \sqrt{2R^2 - R\sqrt{4R^2 - a_n^2}}$ (110)

Consequently, the side $A_{n+1}$ of a regular inscribed $2^{n+1}$-gon is expressed in terms of the side $A_n$ of a regular $2^n$-gon by the formula

$A_{n+1} = \sqrt{2R^2 - R\sqrt{4R^2 - A_n^2}}$

Since $A_n = \frac{P_n}{2^n}$ and $P_{n+1} = 2^{n+1} A_{n+1}$, we have

$P_{n+1} = 2^{n+1}\sqrt{2R^2 - R\sqrt{4R^2 - \frac{P_n^2}{4^n}}}$ (110')

The sequence of numbers $P_2, P_3, \ldots, P_n, \ldots$ tends to the length of the circumference, i.e. to the value $2\pi R$. Therefore, formula (110') may be regarded as the formula for computing $2\pi R$ with the aid of the method of successive approximations. Using this method one may find the value of $\pi$ to any number of decimal places.

*This formula is most easily derived using trigonometry. Obviously, if $a_n$ is the side of a regular inscribed $n$-gon and $a_{2n}$ the side of a regular inscribed $2n$-gon, then

$a_n = 2R\sin\frac{\pi}{n}$ and $a_{2n} = 2R\sin\frac{\pi}{2n}$

(see drawing). Since

$\sin\frac{\alpha}{2} = \sqrt{\frac{1 - \cos\alpha}{2}}$

it follows that

$a_{2n} = 2R\sqrt{\frac{1 - \sqrt{1 - \frac{a_n^2}{4R^2}}}{2}} = \sqrt{2R^2 - R\sqrt{4R^2 - a_n^2}}$
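The doubling process is short enough to try out directly. The sketch below is an illustration and not part of the original text. It assumes the side recurrence $A_{n+1} = \sqrt{2R^2 - R\sqrt{4R^2 - A_n^2}}$ with $A_2 = R\sqrt{2}$, and the function name `pi_by_doubling` is hypothetical.

```python
import math

def pi_by_doubling(n_max, R=1.0):
    """Return P_n / (2R) for the inscribed 2**n_max-gon, starting from the square."""
    A = R * math.sqrt(2.0)          # A_2, side of the inscribed square
    P = 4.0 * A                     # P_2
    for n in range(2, n_max):
        # side of the 2**(n+1)-gon from the side of the 2**n-gon
        A = math.sqrt(2.0 * R * R - R * math.sqrt(4.0 * R * R - A * A))
        P = 2 ** (n + 1) * A
    return P / (2.0 * R)

for n in (2, 4, 6, 8, 10):
    print(n, pi_by_doubling(n))
```

In ordinary floating-point arithmetic the subtraction under the outer root loses digits once the sides become very small, so this direct form should not be pushed to very large n without rearranging the formula.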

There is another method for the approximate computation of $\pi$ called the method of equal perimeters. In this method a regular $2^n$-gon is replaced by a regular $2^{n+1}$-gon having the same perimeter. Denote the apothem of the regular $2^n$-gon by $l_n$ and the radius of the circumscribed circle by $r_n$. We denote the apothem of the regular $2^{n+1}$-gon having the same perimeter as the $2^n$-gon by $l_{n+1}$ and the radius of the circle circumscribed around it by $r_{n+1}$.



Fig. 23

Let AB (Fig. 23) be the side of the $2^n$-gon inscribed in the circle of radius $r_n$. Connect the midpoint C of the arc AB with points A and B and draw the line DE joining the midpoints D and E of the sides AC and BC, respectively, of the triangle ACB. The angle DOE will, obviously, be equal to half the angle AOB. Therefore DE is a side of the regular $2^{n+1}$-gon inscribed in a circle of radius OD. Since $DE = \frac{1}{2}AB$, the perimeter of the $2^{n+1}$-gon is equal to the perimeter of the $2^n$-gon. This means that $r_{n+1} = OD$, $l_{n+1} = OK$, where K is the midpoint of DE.

It may easily be computed that

$l_{n+1} = OK = \frac{r_n + l_n}{2}$ (111)

Then we find from the right-angled triangle ODC that

$r_{n+1} = \sqrt{r_n l_{n+1}}$ (112)

Formulas (111) and (112) express $r_{n+1}$ and $l_{n+1}$ in terms of $r_n$ and $l_n$.

The perimeters of the polygons do not change with increasing n, and the numbers $r_n$ and $l_n$ approach the same limit. This limit is equal to the radius of the circle whose length is equal to the perimeter of the polygons. If we choose the initial polygon so that its perimeter is equal to 2, then $r_n$ and $l_n$ both approach the number $\frac{1}{\pi}$:

$\lim_{n\to\infty} r_n = \frac{1}{\pi}, \quad \lim_{n\to\infty} l_n = \frac{1}{\pi}$

For instance, if we choose for the first polygon a square with the side 1/2, we have

$r_2 = \frac{\sqrt{2}}{4}, \quad l_2 = \frac{1}{4}$

Therefore the following statement is true: if we put $r_2 = \frac{\sqrt{2}}{4}$, $l_2 = \frac{1}{4}$ and compute the values $r_{n+1}$, $l_{n+1}$, n = 2, 3, ..., using formulas (111) and (112), we would obtain

$\lim_{n\to\infty} r_n = \lim_{n\to\infty} l_n = \frac{1}{\pi}$

These formulas may be used to find approximate values of $\frac{1}{\pi}$. In order to do this one should continue the computation until the values of $r_n$ and $l_n$ coincide within the accuracy desired. This common value of $r_n$ and $l_n$ will be the value of $\frac{1}{\pi}$ within the accuracy specified.
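The method of equal perimeters is also easy to follow numerically. The following is a minimal sketch, not part of the original text. It assumes the recurrences $l_{n+1} = (r_n + l_n)/2$ and $r_{n+1} = \sqrt{r_n l_{n+1}}$ with the starting square of side 1/2. The function name and the stopping tolerance are illustrative choices.

```python
import math

def equal_perimeters(tol=1e-12, max_steps=100):
    """Return (r, l) once the radius and the apothem agree to within tol."""
    r = math.sqrt(2.0) / 4.0    # r_2: circumradius of the square with side 1/2
    l = 0.25                    # l_2: apothem of that square
    for _ in range(max_steps):
        if r - l <= tol:
            break
        l = (r + l) / 2.0       # formula (111)
        r = math.sqrt(r * l)    # formula (112), with the new apothem
    return r, l

r, l = equal_perimeters()
print(r, l, 1.0 / r)
```

Since $l_n < \frac{1}{\pi} < r_n$ at every step, the two numbers bracket the limit, and their difference gives a direct estimate of the error.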

n 


28. Conclusion 

This book has acquainted us with the applications of the method of successive approximations to various problems: to planning, to the extraction of roots, to the solution of equations, to the computation of the length of a circumference. This by no means exhausts the list of the various applications of this method. A great number of problems lead to differential equations (which contain derivatives of the unknown functions), to integral equations and to equations of a still more complex kind. One of the most powerful methods for the approximate solution of these equations is the method of successive approximations. Of course, its application in such cases is much more complicated than in the case of algebraic equations. But it can be said that if not for the method of successive approximations, not one of the enormous physical and technical problems which are tackled nowadays could be solved. For instance, the method is used in computing the motion of an artificial satellite, in designing an atomic reactor and in research into the structure of the atom. However, a discussion of the applications of the method of successive approximations outside the field of elementary mathematics would be beyond the scope of this book.



Exercises 


For the reader to be able to test his memory of the methods of approximate solution of equations discussed in this book we present several examples of approximate solutions of equations.

Solve the following equations using the method of iteration*:

1. $x = \frac{1}{(x+1)^2}$

2. $x = (x+1)^3$

3. $x = 4 + \sqrt[3]{\frac{x-1}{x+1}}$

4. $x = 2 \pm \sqrt[4]{x}$

5. $x = \sqrt[3]{5 - x}$

6. $4 - x = \tan x$

7. $x^2 = \sin x$

8. $x^3 = \sin x$

9. $x = \arcsin\frac{x+1}{4}$

10. $x = \cos x$

11. $x = \frac{1}{\cos x}$

12. $x = 1 + \frac{1}{10}\sin x$

13. $x = \pm\sqrt{\log(x+2)}$

14. $x^2 = \ln(x+1)$

15. $\ln x = 4 - x^2$

16. $\ln x = 2 - x$

17. $x^2 = e^x + 2$

18. $\log x = 0.1x$

19. $\tan x = \log x$

20. $x = \frac{1}{10}e^{-x}$

*In some of the examples the reader must first reduce the equation to the form $x = \varphi(x)$.

Solve the following equations using Newton's method:

21. $x^3 - 5x + 1 = 0$

22. $x^3 - 9x^2 + 20x - 11 = 0$

23. $x^3 - 3x^2 - 3x + 11 = 0$

24. $x^5 + 5x + 1 = 0$

25. $\sin x + x = 1$

26. $x^2 - 10\log x - 3 = 0$

27. Solve with an accuracy of 0.001 the following systems of equations using the method of successive approximations:

a)
$x = 0.2y - 0.1z + 0.898$
$y = 0.3x + 0.15z + 1.383$
$z = 0.25x - 0.4y + 3.677$

b)
$x = \frac{1}{4}\sin(x + y) + 0.336$
$y = -\frac{1}{4}\sin(x - y) + 0.362$

c) 



j/x + 2y — 0.710 


y = l/y —x+ 1 



Solutions 


1. Put $\varphi(x) = \frac{1}{(x+1)^2}$. Then $\varphi'(x) = \frac{-2}{(1+x)^3}$. We have $\varphi(0) = 1 > 0$, $\varphi(1) = \frac{1}{4} < 1$. Therefore the interval [0, 1] contains a root of the equation. However, we cannot apply the method of successive approximations to this interval because $|\varphi'(0)| = 2 > 1$. To narrow down the interval note that $\varphi(0.4) = \frac{1}{1.96} > 0.4$, and so the root of the equation lies in the interval [0.4, 1]. If $0.4 \le x \le 1$, then $|\varphi'(x)| \le \frac{2}{1.4^3} < 1$ and therefore the method of successive approximations can be applied. Putting $x_1 = 0.4$ we obtain after 11 approximation steps that $x_{11} \approx \varphi(x_{11}) \approx 0.4655$.

Fig. 24

Therefore with an accuracy of 0.0001 we have x = 0.4655.
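Computations of this kind, which recur in the solutions that follow, can be reproduced with a few lines of code. The sketch below is only an illustration and not part of the original text. The helper `iterate`, its stopping rule (two successive approximations closer than `tol`) and its parameters are assumptions. It is applied to $\varphi(x) = \frac{1}{(x+1)^2}$ from Exercise 1 with the starting value 0.4.

```python
def iterate(phi, x, tol=1e-4, max_steps=1000):
    """Repeat x <- phi(x) until two successive approximations differ by less than tol."""
    for step in range(1, max_steps + 1):
        x_new = phi(x)
        if abs(x_new - x) < tol:
            return x_new, step
        x = x_new
    raise RuntimeError("no convergence within max_steps")

root, steps = iterate(lambda x: 1.0 / (x + 1.0) ** 2, 0.4)
print(root, steps)
```

Because $\varphi'(x) < 0$ here, successive approximations lie on opposite sides of the root, so the distance between two neighbouring approximations bounds the remaining error.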

2. Put $\varphi(x) = (x+1)^3$. Then $\varphi'(x) = 3(x+1)^2$ and $\varphi(-2) = -1 > -2$, $\varphi(-3) = -8 < -3$. Therefore the interval [-3, -2] contains the root of this equation. However, we cannot apply the method of successive approximations, for in the interval [-3, -2] we have $|\varphi'(x)| > 1$. Rewrite the equation in the form

$x = \sqrt[3]{x} - 1$

Then $\psi(x) = \sqrt[3]{x} - 1$ and $\psi'(x) = \frac{1}{3\sqrt[3]{x^2}}$. In the interval [-3, -2] we have $|\psi'(x)| \le \frac{1}{3\sqrt[3]{4}} < 1$, and therefore we can use the method of successive approximations. Putting $x_1 = -2$, we obtain $x_6 \approx \psi(x_6) \approx -2.325$. Therefore with an accuracy of 0.001 we have x = -2.325.

3. Put $\varphi(x) = 4 + \sqrt[3]{\frac{x-1}{x+1}}$. We have

$\varphi'(x) = \frac{2}{3\sqrt[3]{(x-1)^2 (x+1)^4}}$

Figure 24 shows that the straight line $y = x$ intersects the curve $y = 4 + \sqrt[3]{\frac{x-1}{x+1}}$ in two points lying in the intervals [-1, 0] and [4, 5], respectively. On the interval [4, 5] we have $|\varphi'(x)| \le \frac{2}{15\sqrt[3]{45}} < 1$. Putting $x_1 = 4$ we have $x_3 \approx \varphi(x_3) \approx 4.870$. Therefore with an accuracy of 0.001 we have x = 4.870.

In the interval [-1, 0] the method of successive approximations cannot be employed directly. Rewrite the equation in this interval in the form $(x-4)^3 = \frac{x-1}{x+1}$, whence $\frac{x-1}{(x-4)^3} = x + 1$ and $x = \frac{x-1}{(x-4)^3} - 1$. Here $\psi(x) = \frac{x-1}{(x-4)^3} - 1$ and $\psi'(x) = \frac{-2x-1}{(x-4)^4}$. Obviously, for $-1 \le x \le 0$ we have $|\psi'(x)| \le \frac{1}{256} < 1$ and we may now use the method of successive approximations. Putting $x_1 = 0$, we have $x_2 \approx \psi(x_2) \approx -0.9840$. This means that with an accuracy of 0.0001 we have x = -0.9840.

We have found two roots: x = -0.9840, x = 4.870.

Fig. 25

4. Here $\varphi_1(x) = 2 + \sqrt[4]{x}$ and $\varphi_2(x) = 2 - \sqrt[4]{x}$. It may be seen from Fig. 25 that the equation $x = 2 + \sqrt[4]{x}$ has a root on the interval [3, 4]. In this interval

$|\varphi_1'(x)| = \frac{1}{4\sqrt[4]{x^3}} < 1$

Apply the method of successive approximations. Putting $x_1 = 4$, we have $x_4 \approx \varphi_1(x_4) \approx 3.353$. Hence, with an accuracy of 0.001 we have x = 3.353.

Now solve the equation $x = 2 - \sqrt[4]{x}$. Its root is x = 1. Hence, the roots of the equation are 1 and 3.353.


5. Here $\varphi(x) = \sqrt[3]{5 - x}$ and $\varphi'(x) = \frac{-1}{3\sqrt[3]{(5-x)^2}}$. We have $\varphi(1) = \sqrt[3]{4} < 2$, $\varphi(2) = \sqrt[3]{3} > 1$. Therefore there is a root of the equation on the interval [1, 2]. In this interval $|\varphi'(x)| \le \frac{1}{3\sqrt[3]{9}} < 1$. Putting $x_1 = 1$, we have $x_5 \approx \varphi(x_5) \approx 1.516$. Hence, with an accuracy of 0.001 we have x = 1.516.



6. Write the equation in the form

$x = \arctan(4 - x)$

Here $\varphi(x) = \arctan(4 - x)$. We have $\varphi(1) = \arctan 3 \approx 1.25$, $\varphi(2) = \arctan 2 \approx 1.10$. It follows from this that the equation has a root lying on the interval [1, 2]. In this interval we have

$|\varphi'(x)| = \frac{1}{1 + (4-x)^2} \le \frac{1}{5}$

Therefore the method of successive approximations is applicable. Putting $x_1 = 1$, we have $x_4 \approx \varphi(x_4) \approx 1.225$. Therefore with an accuracy of 0.001 we have x = 1.225.

7. The given equation has a root x = 0. It may be seen from Fig. 26 that the second root is positive. Therefore it satisfies the equation $x = \sqrt{\sin x}$. Here $\varphi(x) = \sqrt{\sin x}$ and $\varphi'(x) = \frac{\cos x}{2\sqrt{\sin x}}$. Since

$\varphi\left(\frac{1}{2}\right) = \sqrt{\sin\frac{1}{2}} > \frac{1}{2}$

and

$\varphi(1) = \sqrt{\sin 1} \approx \sqrt{0.8414} < 1$

it follows that the equation has a root in the interval $\left[\frac{1}{2}, 1\right]$. In this interval we have

$|\varphi'(x)| \le \frac{\cos\frac{1}{2}}{2\sqrt{\sin\frac{1}{2}}} < 1$

and hence the method of successive approximations converges. Putting $x_1 = 1$, we obtain $x_7 \approx \varphi(x_7) \approx 0.8768$. Consequently, with an accuracy of 0.0001, the second root of the equation is 0.8768.



8. This equation is solved in the same way as the preceding one. Rewriting the equation in the form

$x = \sqrt[3]{\sin x}$

and putting $x_1 = 1$ we obtain $x_6 \approx \varphi(x_6) \approx 0.9286$. Therefore, with an accuracy of 0.001 one of the roots of the equation is equal to 0.9286. Since both sides of the equation are odd functions, there is also another root equal to -0.9286. The third root is 0.

9. The equation may be rewritten in the form $\frac{x+1}{4} = \sin x$, $-\pi/2 \le x \le \pi/2$. It may be seen from Fig. 27 that this equation has a single root lying between 0 and $\pi/2$. In this case

$\varphi(x) = \arcsin\frac{x+1}{4}, \quad \varphi'(x) = \frac{1}{\sqrt{16 - (x+1)^2}}$

On the interval $[0, \pi/2]$ we have $|\varphi'(x)| < \frac{1}{3} < 1$. Putting $x_1 = 0$, we find $x_8 \approx \varphi(x_8) \approx 0.3422$, and therefore with an accuracy of 0.0001 we have x = 0.3422.

10. Since cos 0 = 1, cos 1 > 0, the equation $x = \cos x$ has a root on the interval [0, 1]. Since $|\varphi'(x)| \le \sin 1 < 1$, the method of successive approximations may be used. Putting $x_1 = 1$, we get x = 0.7391 with an accuracy of 0.0001.



11. It may be seen from Fig. 28 that the positive roots of the equation lie close to the points of intersection of the graph of the function $y = \cos x$ with the x-axis, and are to the right of the points of intersection of the type $\frac{\pi}{2} + (2k+1)\pi$ and to the left of the points of intersection of the type $\frac{\pi}{2} + 2k\pi$. To find the solution in the vicinity of the point $x = \frac{\pi}{2} + n\pi$ put $x - n\pi - \frac{\pi}{2} = y$. The equation will assume the form

$y + n\pi + \frac{\pi}{2} = \frac{1}{\cos\left(\frac{\pi}{2} + n\pi + y\right)} = \frac{(-1)^{n+1}}{\sin y}$

Since $-\frac{\pi}{2} < y < \frac{\pi}{2}$, the equation may be written thus:

$y = (-1)^{n+1}\arcsin\frac{1}{y + n\pi + \frac{\pi}{2}}$

Here

$\varphi(y) = (-1)^{n+1}\arcsin\frac{1}{y + n\pi + \frac{\pi}{2}}$

and

$\varphi'(y) = \frac{(-1)^n}{\left|y + n\pi + \frac{\pi}{2}\right|\sqrt{\left(y + n\pi + \frac{\pi}{2}\right)^2 - 1}}$

Clearly, in the vicinity of the point y = 0 we have $|\varphi'(y)| < q < 1$, and so we may use the method of successive approximations. Find the solution for n = 1 with an accuracy of 0.001. Let $y_0 = 0$. Then $y_2 \approx \varphi(y_2) \approx 0.204$. Therefore $y \approx 0.204$. Hence $x = \frac{3}{2}\pi + y \approx 4.917$.

To find the first negative root, put n = -1. We shall obtain the equation

$y = \arcsin\frac{1}{y - \frac{\pi}{2}}$

Put $y_0 = 0$. Then

$y_6 \approx \varphi(y_6) \approx -0.503$

Hence $y \approx -0.503$ and so $x \approx -2.074$.

For large values of |n| the method of successive approximations gives an approximate formula for y:

$y \approx \varphi(y_0) = (-1)^{n+1}\arcsin\frac{1}{n\pi + \frac{\pi}{2}} \approx \frac{(-1)^{n+1} \times 2}{(2n+1)\pi}$

Therefore

$x \approx \frac{\pi}{2}(2n+1) + \frac{(-1)^{n+1} \times 2}{(2n+1)\pi}$


12. Putting $x_1 = 0$ we obtain $x_3 \approx \varphi(x_3) \approx 1.088$. Therefore with an accuracy of 0.001 we have x = 1.088.

13. First solve the equation $x = \sqrt{\log(x+2)}$. We have $\varphi(x) = \sqrt{\log(x+2)}$ and therefore

$\varphi'(x) = \frac{\log e}{2(x+2)\sqrt{\log(x+2)}}$

Since $\varphi(0) = \sqrt{\log 2} > 0$, $\varphi(1) = \sqrt{\log 3} < 1$, the equation has a root on the interval [0, 1]. On this interval the inequality $|\varphi'(x)| \le q < 1$ holds. Therefore the method of successive approximations is applicable. Putting $x_1 = 1$ we have $x_5 \approx \varphi(x_5) \approx 0.6507$. Therefore the root of the equation $x = \sqrt{\log(x+2)}$ is equal to x = 0.6507 with an accuracy of 0.0001.

Consider the equation

$x = -\sqrt{\log(x+2)}$

Here

$\varphi(x) = -\sqrt{\log(x+2)}$

Since $\varphi(0) = -\sqrt{\log 2} = -0.55$, $\varphi\left(-\frac{1}{2}\right) = -\sqrt{\log 1.5} = -0.42$, the equation has a root on the interval $\left[-\frac{1}{2}, 0\right]$. Putting $x_1 = 0$ we find $x_8 \approx \varphi(x_8) \approx -0.4397$. Hence, with an accuracy of 0.0001 we have x = -0.4397.

14. One of the roots of the equation is x = 0. To find another root, write the equation in the form $x = \pm\sqrt{\ln(x+1)}$. For the equation $x = \sqrt{\ln(x+1)}$ we have

$\varphi\left(\frac{1}{2}\right) = \sqrt{\ln\frac{3}{2}} > \frac{1}{2}, \quad \varphi(1) = \sqrt{\ln 2} < 1$

Hence the equation has a root on the interval [1/2, 1]. Since $\varphi'(x) = \frac{1}{2(x+1)\sqrt{\ln(x+1)}}$, we have $|\varphi'(x)| \le q < 1$ on the interval [1/2, 1]. Put $x_1 = 1$, then $x_9 \approx \varphi(x_9) \approx 0.7469$. Hence, with an accuracy of 0.0001 we have x = 0.7469. The equation $x = -\sqrt{\ln(x+1)}$ has no roots other than x = 0. Hence x = 0 or x = 0.7469.

15. Rewrite the equation in the form

$x = \sqrt{4 - \ln x}$

Here

$\varphi(x) = \sqrt{4 - \ln x}, \quad \varphi'(x) = \frac{-1}{2x\sqrt{4 - \ln x}}$

Since $\varphi(1) = 2$, $\varphi(2) = \sqrt{4 - \ln 2} < 2$, the equation has a root on the interval [1, 2]. It is evident from Fig. 29 that there are no other roots. Putting $x_1 = 2$ we obtain

$x_4 \approx \varphi(x_4) \approx 1.841$

Hence x = 1.841 with an accuracy of 0.001.

16. Write the equation in the form

$x = 2 - \ln x$

Here $\varphi(x) = 2 - \ln x$, $\varphi'(x) = -\frac{1}{x}$. It may be seen from Fig. 30 that the root of the equation lies on the interval [1, 2]. In this interval $|\varphi'(x)| \le 1$, with equality only at x = 1. Putting $x_1 = 1.5$ we obtain $x_{13} \approx \varphi(x_{13}) \approx 1.557$. Hence, x = 1.557 with an accuracy of 0.001.

17. It may be seen from Fig. 31 that the equation has only one root, and that this root is negative. Write the equation in the form

$x = -\sqrt{e^x + 2}$

Then

$\varphi(x) = -\sqrt{e^x + 2}, \quad \varphi'(x) = \frac{-e^x}{2\sqrt{e^x + 2}}$

and

$\varphi(-1) = -\sqrt{2 + e^{-1}} \approx -1.54; \quad \varphi(-2) = -\sqrt{2 + e^{-2}} \approx -1.46$

Therefore the root lies on the interval [-2, -1]. Putting $x_1 = -1$ we get $x_4 \approx \varphi(x_4) \approx -1.492$. Hence with an accuracy of 0.001 we have x = -1.492.

Fig. 29

Fig. 30

18. Clearly one of the roots of the equation is $x_1 = 10$. To find the second root, write the equation in the form $x = 10^{0.1x}$. Here $\varphi(x) = 10^{0.1x}$, $\varphi'(x) = 0.1 \times 10^{0.1x}\ln 10$. Also $\varphi(1) = 10^{0.1} > 1$, $\varphi(2) = 10^{0.2} < 2$. Therefore the equation has a root on the interval [1, 2]. In this interval $|\varphi'(x)| \le 0.1 \times 10^{0.2}\ln 10 \approx 0.37 < 1$. Now the method of successive approximations can be used. Putting $x_1 = 2$ we find $x_7 \approx \varphi(x_7) \approx 1.372$. Hence, with an accuracy of 0.001 we have x = 1.372.


19. It may be seen from Fig. 32 that the equation has a root in each of the intervals $\left(\frac{\pi}{2} + n\pi, \frac{\pi}{2} + (n+1)\pi\right)$, n = 0, 1, ..., with the roots lying in the right halves of the intervals. To find the first positive root, substitute $x = \frac{3}{2}\pi - y$. The equation will assume the form

$\cot y = \log\left(\frac{3}{2}\pi - y\right)$

from which we get

$y = \operatorname{arccot}\left[\log\left(\frac{3}{2}\pi - y\right)\right]$

since $0 < y < \pi$. Here

$\varphi(y) = \operatorname{arccot}\left[\log\left(\frac{3}{2}\pi - y\right)\right]$

and

$\varphi'(y) = \frac{\log e}{\left(\frac{3}{2}\pi - y\right)\left[1 + \log^2\left(\frac{3}{2}\pi - y\right)\right]}$

On the interval $[0, \pi]$ there is a root of our equation. Moreover, in this interval $|\varphi'(y)| < 1$. Apply the method of successive approximations. Putting $y_1 = 0$ we have $y_4 \approx \varphi(y_4) \approx 1.059$. Hence with an accuracy of 0.001 we have y = 1.059 and therefore x = 3.654.

To find the second positive root we put $x = \frac{5}{2}\pi - y$. The equation assumes the form

$y = \operatorname{arccot}\left[\log\left(\frac{5}{2}\pi - y\right)\right]$

Putting $y_1 = 0$ we get $y_4 \approx \varphi(y_4) \approx 0.870$. Hence with an accuracy of 0.001 we have y = 0.870 and x = 6.984.

20. It may be seen from Fig. 33 that the equation has a single root lying between 0 and 1. We have $\varphi(x) = \frac{1}{10}e^{-x}$, $\varphi'(x) = -\frac{1}{10}e^{-x}$. On the interval [0, 1] the inequality $|\varphi'(x)| \le \frac{1}{10}$ holds, which makes the use of the method of successive approximations possible. Putting $x_1 = 0$ we have $x_4 \approx \varphi(x_4) \approx 0.091$. Hence with an accuracy of 0.001 we have x = 0.091.

21. Let

$f(x) = x^3 - 5x + 1$

Then

$f'(x) = 3x^2 - 5, \quad f''(x) = 6x$

Fig. 32

Fig. 33

Using Newton's formula we have

$\beta_{n+1} = \beta_n - \frac{\beta_n^3 - 5\beta_n + 1}{3\beta_n^2 - 5}$

Compute a table of the values of the function:

x      -3    -2    -1     0     1     2     3
f(x)  -11     3     5     1    -3    -1    13

It is seen from this table that the equation $x^3 - 5x + 1 = 0$ has roots on the intervals [-3, -2], [0, 1], [2, 3].

First let us find the root lying on the interval [-3, -2]. Since in this interval $f''(x) < 0$, we choose the initial value $\beta_0 = -3$ (because $f(\beta_0) = -11$ is a negative number). We have

$\beta_1 = -3 - \frac{(-3)^3 - 5(-3) + 1}{3(-3)^2 - 5} = -2.5$

Continuing the computation we find that $\beta_3 \approx \beta_4 \approx -2.331$ and therefore with an accuracy of 0.001 the root of the equation on the interval [-3, -2] is -2.331.

Next we find the root lying on the interval [0, 1]. Here we have $f''(x) \ge 0$. Therefore we put $\beta_0 = 0$. Hence we obtain

$\beta_1 = 0 - \frac{0^3 - 5 \times 0 + 1}{3 \times 0^2 - 5} = 0.2, \quad \beta_3 \approx 0.202$

With an accuracy of 0.001 we have x = 0.202.

Finally, to find the root on the interval [2, 3] we put $\beta_0 = 3$ and get

$\beta_1 = 3 - \frac{3^3 - 5 \times 3 + 1}{3 \times 3^2 - 5} \approx 2.409$

Continuing the computation we find $\beta_4 \approx \beta_5 \approx 2.128$. Hence with an accuracy of 0.001 the root is equal to 2.128. We have found three roots: $x_1 = -2.331$; $x_2 = 0.202$; $x_3 = 2.128$.
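The three Newton computations can be repeated mechanically. The following sketch is an illustration only and not part of the original text. The function `newton`, its tolerance and its step limit are assumptions. The starting values -3, 0 and 3 are those chosen in the solution.

```python
def newton(f, df, x, tol=1e-10, max_steps=50):
    """Newton's formula x <- x - f(x)/f'(x), stopped when the correction is below tol."""
    for _ in range(max_steps):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("no convergence within max_steps")

def f(x):
    return x ** 3 - 5 * x + 1

def df(x):
    return 3 * x ** 2 - 5

roots = [newton(f, df, start) for start in (-3.0, 0.0, 3.0)]
print(roots)
```

Printing the intermediate values of `x` inside the loop shows how quickly the approximations settle once they are close to a root.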

Solve this equation using the improved method of chords. On the interval [-3, -2] we find

$\alpha_1 = -3 - f(-3)\frac{-3 - (-2)}{f(-3) - f(-2)} = -3 + 11 \times \frac{-1}{-14} \approx -2.214$

Since on this interval $f''(x) < 0$, the curve is concave down and we find $\alpha_2$ using the formula

$\alpha_2 = -3 - f(-3)\frac{-3 - (-2.214)}{f(-3) - f(-2.214)} \approx -2.293$

Next we find

$\alpha_3 = -2.293 - f(-2.293)\frac{-2.293 + 2.214}{f(-2.293) - f(-2.214)} \approx -2.331$

This coincides within the accuracy of 0.001 with the value of x obtained above. The same method is used to solve the equation on the intervals [0, 1] and [2, 3].
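For comparison, here is a sketch of the plain method of chords with the end -3 held fixed. It is an illustration only: it reproduces the first two steps above but is not claimed to be the improved variant used for the third step, and the name `chord_fixed_end` is hypothetical.

```python
def chord_fixed_end(f, a, x, steps):
    """Chord iterations with the end a held fixed; returns the list of approximations."""
    approximations = []
    for _ in range(steps):
        # intersection of the chord through (a, f(a)) and (x, f(x)) with the x-axis
        x = a - f(a) * (a - x) / (f(a) - f(x))
        approximations.append(x)
    return approximations

def f(x):
    return x ** 3 - 5 * x + 1

alphas = chord_fixed_end(f, -3.0, -2.0, 40)
print(alphas[0], alphas[1], alphas[-1])
```

With a fixed end the approximations approach the root from one side only and more slowly than in Newton's method, which is one reason for combining or modifying the two methods.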

22. Here we have

$f(x) = x^3 - 9x^2 + 20x - 11$
$f'(x) = 3x^2 - 18x + 20$
$f''(x) = 6x - 18 = 6(x - 3)$

Compile the table of the values of the function f(x):

x       0     1     2     3     4     5     6
f(x)  -11     1     1    -5   -11   -11     1

The roots of the equation lie on the intervals [0, 1], [2, 3], [5, 6].

On the interval [0, 1] we put $\beta_0 = 0$ and find $\beta_4 \approx \beta_5 \approx 0.834$. On the interval [2, 3] we put $\beta_0 = 3$ and find $\beta_2 \approx \beta_3 \approx 2.216$. On the interval [5, 6] we put $\beta_0 = 6$ and find $\beta_4 \approx \beta_5 \approx 5.949$. We have found (with an accuracy of 0.001) three roots of the equation:

$x_1 = 0.834, \quad x_2 = 2.216, \quad x_3 = 5.949$

23. Here $f(x) = x^3 - 3x^2 - 3x + 11$, $f'(x) = 3x^2 - 6x - 3$, $f''(x) = 6x - 6 = 6(x - 1)$. Compile a table of the values of f(x):

x      -2    -1     0     1     2     3
f(x)   -3    10    11     6     1     2

The equation has one real root lying on the interval [-2, -1]. To find this root we put $\beta_0 = -2$. We obtain $\beta_2 \approx \beta_3 \approx -1.847$. Therefore with an accuracy of 0.001 we have x = -1.847.

24. Here $f(x) = x^5 + 5x + 1$, $f'(x) = 5x^4 + 5$, $f''(x) = 20x^3$. The table of the f(x) values is

x      -1     0     1
f(x)   -5     1     7

Therefore the equation has a root on the interval [-1, 0]. We put $\beta_0 = -1$. With an accuracy of 0.0001 we have $\beta_3 \approx \beta_4 \approx -0.1999$. Therefore with the accuracy specified x = -0.1999.

25. Here $f(x) = \sin x + x - 1$, $f'(x) = \cos x + 1$, $f''(x) = -\sin x$. A table of the values of f(x) is

x       0       1        2
f(x)   -1     0.8415   1.9093

The root lies on the interval [0, 1]. Putting $\beta_0 = 0$ we find $\beta_2 \approx \beta_3 \approx 0.5110$. Therefore with an accuracy of 0.0001 we have x = 0.5110.
26. Here $f(x) = x^2 - 10\log x - 3$, $f'(x) = 2x - \frac{10}{x\ln 10}$, $f''(x) = 2 + \frac{10}{x^2\ln 10}$. A table of the values of f(x) is

x      0.5      1       2       3
f(x)   0.26    -2     -2.01    1.23

The roots of the equation lie on the intervals [0.5, 1] and [2, 3]. On the interval [0.5, 1] we put $\beta_0 = 0.5$ and obtain $\beta_2 \approx \beta_3 \approx 0.535$. Therefore the corresponding root with an accuracy of 0.001 is equal to 0.535. On the interval [2, 3] we put $\beta_0 = 3$ and find $\beta_2 \approx \beta_3 \approx 2.705$.

The equation has two roots: $x_1 = 0.535$; $x_2 = 2.705$.

27. In the system (a) we put $x_0 = 0$, $y_0 = 0$, $z_0 = 0$ and after a few approximations find with an accuracy of 0.001 that x = 1.021, y = 2.150, z = 3.072.

In system (b) we put $x_0 = 0$, $y_0 = 0$ and after a few approximations find (with an accuracy of 0.001) that x = 0.520, y = 0.310.

In system (c) we put $x_0 = 0$, $y_0 = 0$ and find x = 1.000, y = 2.000.
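The result for system (a) can be checked with a short computation. This is only a sketch and not part of the original text. It assumes system (a) in the form x = 0.2y - 0.1z + 0.898, y = 0.3x + 0.15z + 1.383, z = 0.25x - 0.4y + 3.677, and the function name `solve_system_a` is illustrative.

```python
def solve_system_a(steps=50):
    """Simple successive approximations for system (a), starting from zero."""
    x = y = z = 0.0
    for _ in range(steps):
        # the right-hand sides use only the previous approximation
        x, y, z = (0.2 * y - 0.1 * z + 0.898,
                   0.3 * x + 0.15 * z + 1.383,
                   0.25 * x - 0.4 * y + 3.677)
    return x, y, z

x, y, z = solve_system_a()
print(round(x, 3), round(y, 3), round(z, 3))
```

The sums of the absolute values of the coefficients in each equation are 0.3, 0.45 and 0.65, all less than 1, so the first of the sufficient conditions for convergence is satisfied.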